Read the last sentence first: it usually contains the deciding constraint (lowest cost, least ops, highest availability, most secure).
If multiple answers “work”, the best answer is typically the one with managed services + least operational overhead.
If the question mentions private subnets, assume no direct internet access unless NAT/endpoints are included.
If you see AccessDenied, separate concerns: identity policy vs resource policy vs KMS key policy.
Constraint-first elimination flow
flowchart TD
Q["Read the last sentence first"] --> C["Find the deciding constraint"]
C --> A["Availability / resilience?"]
C --> P["Performance / scale?"]
C --> S["Security / private access?"]
C --> K["Cost / least ops?"]
A --> E["Eliminate answers that miss that constraint"]
P --> E
S --> E
K --> E
E --> W["Choose the lowest-complexity answer that still works"]
What to notice:
Many SAA-C03 mistakes happen before service selection, because the real constraint was missed.
The best answer is often the managed service or the simpler architecture that still satisfies that constraint.
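The elimination flow can be sketched in a few lines of Python. This is only an illustration of the two steps (eliminate, then pick the simplest survivor); the `Option` class, the `complexity` scores, and the sample answers are hypothetical and not taken from any real exam question.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    satisfies: set      # constraints this answer meets
    complexity: int     # hypothetical score: lower = more managed, less ops


def pick_answer(options, constraint):
    # Step 1: eliminate answers that miss the deciding constraint.
    survivors = [o for o in options if constraint in o.satisfies]
    if not survivors:
        return None
    # Step 2: choose the lowest-complexity answer that still works.
    return min(survivors, key=lambda o: o.complexity)


options = [
    Option("Self-managed proxy fleet on EC2", {"private access"}, 3),
    Option("NAT gateway", {"internet egress"}, 2),
    Option("Gateway VPC endpoint", {"private access", "lowest cost"}, 1),
]

best = pick_answer(options, "private access")
print(best.name)  # Gateway VPC endpoint
```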
Final 20-minute recall (exam day)
Cue -> best answer (pattern map)
| If the question says… | Usually best answer |
| --- | --- |
| Private subnet needs S3/DynamoDB access | Gateway VPC endpoint |
| Many VPCs with transitive routing | Transit Gateway |
| Cross-account private service exposure | PrivateLink |
| Managed relational HA database | RDS/Aurora Multi-AZ |
| Read scaling for relational DB | Read replicas |
| Highly available shared Linux file storage | EFS |
| Cache hot reads / reduce DB load | ElastiCache |
| Global static + dynamic acceleration | CloudFront (HTTP) or Global Accelerator (TCP/UDP/static IP) |
| Event decoupling and buffering | SQS/SNS/EventBridge based on pattern |
| Strict RTO/RPO across Regions | Warm standby or active-active pattern |
Must-memorize SAA defaults
| Topic | Fast recall |
| --- | --- |
| Security groups vs NACLs | SG stateful, NACL stateless |
| NAT design | One NAT gateway per AZ for HA and to avoid cross-AZ dependency |
| S3 durability vs availability | Very high durability; availability depends on class and design |
| Multi-AZ vs Multi-Region | Multi-AZ for AZ failure; Multi-Region for regional resilience |
| IAM evaluation | Explicit deny overrides any allow |
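The "explicit deny overrides any allow" rule can be sketched as a tiny evaluator. This is a deliberately simplified model: it flattens identity and resource policies into one list and ignores conditions, wildcards, SCPs, permission boundaries, and cross-account rules. The statement format and the `evaluate` function are assumptions made for illustration, not an AWS API.

```python
def evaluate(statements, action):
    """Return the decision for `action` given a flat list of statements.

    Each statement is a dict with an 'effect' ('Allow' or 'Deny') and a set
    of 'actions'. Hypothetical format for this sketch only.
    """
    matching = [s for s in statements if action in s["actions"]]
    # An explicit Deny wins, no matter how many Allows also match.
    if any(s["effect"] == "Deny" for s in matching):
        return "ExplicitDeny"
    if any(s["effect"] == "Allow" for s in matching):
        return "Allow"
    # Nothing matched: access is denied by default.
    return "ImplicitDeny"


identity_policy = [{"effect": "Allow", "actions": {"s3:GetObject", "s3:PutObject"}}]
bucket_policy = [{"effect": "Deny", "actions": {"s3:PutObject"}}]
combined = identity_policy + bucket_policy

print(evaluate(combined, "s3:GetObject"))     # Allow
print(evaluate(combined, "s3:PutObject"))     # ExplicitDeny
print(evaluate(combined, "s3:DeleteObject"))  # ImplicitDeny
```

When debugging an AccessDenied, the same split applies: check whether the denial is explicit (some policy says Deny) or implicit (no policy says Allow).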
Last-minute traps
Choosing NAT when endpoint is the private/cheaper answer for S3/DynamoDB.
Confusing Multi-AZ (HA) with read replicas (read scale/DR).
Picking custom/self-managed stacks when managed service meets the requirement.
Ignoring data transfer and cross-AZ costs in architecture choices.
1) VPC and networking — patterns that win
Subnets, routing, and egress (defaults you get tested on)
Public subnet: route table has 0.0.0.0/0 → IGW.
Private subnet: no IGW route; outbound internet needs NAT Gateway (typically one per AZ for HA).
Avoid cross-AZ hairpinning (SPOF + cross-AZ charges): private subnets in AZ‑A should use NAT in AZ‑A.
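A quick way to reason about the per-AZ NAT rule is to check each private subnet's default route against the AZ of its NAT gateway. The sketch below works on plain dictionaries; the subnet names, NAT identifiers, and AZ names are hypothetical, and no AWS API is called.

```python
def cross_az_nat_routes(private_subnets, nat_gateways):
    """Return names of private subnets whose default route targets a NAT
    gateway that lives in a different AZ (a hairpin)."""
    offenders = []
    for subnet in private_subnets:
        nat_az = nat_gateways[subnet["nat"]]
        if nat_az != subnet["az"]:
            offenders.append(subnet["name"])
    return offenders


# Hypothetical layout: one NAT gateway per AZ, but app-b is misrouted.
nat_gateways = {"nat-a": "us-east-1a", "nat-b": "us-east-1b"}
private_subnets = [
    {"name": "app-a", "az": "us-east-1a", "nat": "nat-a"},
    {"name": "app-b", "az": "us-east-1b", "nat": "nat-a"},  # crosses AZs
]

print(cross_az_nat_routes(private_subnets, nat_gateways))  # ['app-b']
```

In this layout, an outage in us-east-1a would also cut egress for app-b, and its traffic pays cross-AZ charges in the meantime.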
Gateway vs Interface endpoints (huge cost + security lever)
Versioning + lifecycle rules (protect against deletes and ransomware scenarios).
CRR/SRR for replication needs (cross-region or same-region).
Object Lock (WORM) for immutability requirements.
Backups (fast picks)
EBS snapshots are incremental; copy snapshots cross-region for DR.
AWS Backup helps centralize backup policies across common services.
For databases, prefer managed backup features (RDS automated backups/snapshots).
5) Databases and caching — RDS/Aurora vs DynamoDB
RDS: Multi-AZ vs read replicas (classic SAA)
| Feature | Multi-AZ | Read replica |
| --- | --- | --- |
| Primary purpose | HA/failover | Read scaling |
| Writes | One primary | Still one primary |
| Failover | Automatic | Manual promotion (generally) |
Rule: Multi-AZ is about availability; read replicas are about scale.
Aurora (why it’s often a “best answer”)
Higher throughput than standard RDS engines (common exam framing).
Multiple Aurora Replicas for read scaling; they share the cluster's storage volume, so replica lag within the Region is typically low.
Aurora Global Database for low-latency global reads and faster cross-region DR.
When to choose what (fast)
| Need | Best-fit |
| --- | --- |
| Relational + joins + transactions | RDS/Aurora |
| Massive key-value scale | DynamoDB |
| Sub-millisecond cache | ElastiCache |
| DynamoDB read cache | DAX |
ElastiCache: Redis vs Memcached (SAA-level)
| Service | Best for | Notes |
| --- | --- | --- |
| Redis | Rich features + durability options | Replication, multi-AZ patterns, data structures |
| Memcached | Simple cache | Very simple, no persistence |
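To see why a cache reduces database load, here is a minimal sketch of the common cache-aside (lazy loading) read path. An in-memory dict stands in for ElastiCache and a plain dict lookup stands in for the database; the `CacheAside` class is hypothetical, and TTLs, eviction, and invalidation on writes are left out.

```python
class CacheAside:
    """Read-through helper: check the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._load = load_from_db   # callable that reads from the database
        self._cache = {}            # stand-in for Redis/Memcached
        self.db_reads = 0           # counts how often the database is hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]     # cache hit: no database call
        self.db_reads += 1              # cache miss: read from the database
        value = self._load(key)
        self._cache[key] = value        # populate the cache for next time
        return value


database = {"user:1": "Ada"}            # hypothetical data
users = CacheAside(database.get)

for _ in range(3):
    users.get("user:1")

print(users.db_reads)  # 1
```

Three reads of the same hot key cost one database read; the other two are served from the cache.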
DynamoDB: what wins questions
Prefer Query over Scan.
Choose partition keys to avoid hot partitions.
Use GSIs for new access patterns.
Use On‑Demand for spiky traffic; Provisioned + Auto Scaling for steady predictable workloads.
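The Query-versus-Scan advice comes down to how many items get read. The toy model below is not the DynamoDB API; `ToyTable` is a hypothetical in-memory stand-in that groups items by partition key and counts the items each operation touches.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table keyed by partition key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def put(self, pk, item):
        self._partitions[pk].append(item)

    def query(self, pk):
        # A query goes straight to one partition key's items.
        items = list(self._partitions.get(pk, []))
        return items, len(items)

    def scan(self, predicate):
        # A scan reads every item, then filters.
        read = 0
        matches = []
        for items in self._partitions.values():
            for item in items:
                read += 1
                if predicate(item):
                    matches.append(item)
        return matches, read


table = ToyTable()
for customer in range(100):
    for order in range(10):
        table.put(f"customer#{customer}", {"customer": customer, "order": order})

_, query_reads = table.query("customer#7")
_, scan_reads = table.scan(lambda item: item["customer"] == 7)
print(query_reads, scan_reads)  # 10 1000
```

Both calls return the same ten orders, but the scan reads all 1,000 items to find them. The same picture explains hot partitions: if most traffic targets one key, one partition does most of the work.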
6) Resilience and DR — RTO/RPO patterns
HA patterns (default architecture)
Multi-AZ for app tiers (ALB + ASG across AZs).
Databases: Multi-AZ where required (RDS/Aurora).
Use queues/caching to absorb spikes and failures.
DR strategies (know the table)
| Strategy | Typical RTO | Typical RPO | Cost | Notes |
| --- | --- | --- | --- | --- |
| Backup/Restore | High | Hours | Low | Cheapest; slowest recovery |
| Pilot Light | Medium | Minutes–hours | Med | Minimal core in DR |
| Warm Standby | Low | Minutes | Med+ | Scaled-down prod running |
| Multi-site active-active | Very low | Seconds | High | Complex; highest cost |
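One way to use the table on exam questions is "cheapest strategy that still meets the RTO". The sketch below encodes only the ordinal ranks from the table (no real durations or prices); the numeric ranks and the `cheapest_strategy` helper are assumptions made for illustration.

```python
# (name, rto_rank, cost_rank): lower rank means faster recovery / lower cost.
# Ranks mirror the table's ordering only; they are not measured values.
STRATEGIES = [
    ("Backup/Restore", 3, 0),
    ("Pilot Light", 2, 1),
    ("Warm Standby", 1, 2),
    ("Multi-site active-active", 0, 3),
]

RTO_RANK = {"High": 3, "Medium": 2, "Low": 1, "Very low": 0}


def cheapest_strategy(required_rto):
    """Pick the lowest-cost strategy whose RTO tier meets the requirement."""
    limit = RTO_RANK[required_rto]
    candidates = [s for s in STRATEGIES if s[1] <= limit]
    return min(candidates, key=lambda s: s[2])[0]


print(cheapest_strategy("High"))      # Backup/Restore
print(cheapest_strategy("Low"))       # Warm Standby
print(cheapest_strategy("Very low"))  # Multi-site active-active
```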
Multi-Region data options (high yield)
| Data layer | Multi-Region option |
| --- | --- |
| S3 | CRR |
| DynamoDB | Global tables |
| Aurora | Aurora Global Database |
DR routing: Route 53 policy selection
| Routing policy | Best for |
| --- | --- |
| Failover | Active-passive DR |
| Weighted | Canary / migrations |
| Latency | Lowest latency routing per user |
| Geolocation | Compliance/content by country |
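For weighted routing, each record receives a share of traffic equal to its weight divided by the sum of all weights, which is why it suits canaries and gradual migrations. A minimal sketch of that arithmetic, with hypothetical record names (the all-zero-weights case is not modeled here):

```python
def traffic_share(weights):
    """Map each record name to its expected fraction of DNS responses."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("this sketch does not model all-zero weights")
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical canary rollout: send 5% of traffic to the new stack.
shares = traffic_share({"current": 95, "canary": 5})
print(shares)  # {'current': 0.95, 'canary': 0.05}
```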
Exam cue: If you need faster failover with static anycast IPs, consider Global Accelerator. If you need caching + origin failover for HTTP(S), consider CloudFront.
Active-passive vs active-active (quick framing)
| Pattern | Pros | Cons |
| --- | --- | --- |
| Active-passive | Cheaper; simpler operations | Higher RTO; failover/failback steps |
| Active-active | Lowest downtime; global performance | Most complex; highest cost |
DR sketch (active-passive)
flowchart LR
Users --> R53[Route 53]
R53 -->|Primary| A[Region A]
R53 -->|Failover| B[Region B]
A --> AppA[App + DB]
B --> AppB[Warm standby]
7) Observability and operations — what each tool answers
| Service | Think “this answers…” |
| --- | --- |
| CloudWatch | “How is it performing?” (metrics/logs/alarms) |
| CloudTrail | “Who did what?” (API audit trail) |
| Config | “What changed?” (config history + compliance) |
| X-Ray | “Where is latency?” (distributed traces) |
Exam cue: if the requirement is auditing and investigations, CloudTrail is usually the anchor.
1) Create primary A/AAAA alias to ALB in Region A with Health Check.
2) Create secondary A/AAAA alias to ALB in Region B with Health Check.
3) Set routing policy to Failover: Primary / Secondary.
4) Verify health checks and simulate failover.
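The behavior those steps set up can be sketched as a small resolver: answer with the primary while it is healthy, otherwise with the secondary. The hostnames are hypothetical, and falling back to the primary when nothing is healthy is an assumption of this sketch rather than something the steps above state.

```python
def resolve(records, healthy):
    """Return the target a failover record set would answer with.

    records: {'PRIMARY': target, 'SECONDARY': target}
    healthy: set of targets currently passing their health checks
    """
    primary = records["PRIMARY"]
    secondary = records["SECONDARY"]
    if primary in healthy:
        return primary
    if secondary in healthy:
        return secondary
    # Assumption for this sketch: with nothing healthy, answer with primary.
    return primary


records = {
    "PRIMARY": "alb-region-a.example.com",    # hypothetical ALB in Region A
    "SECONDARY": "alb-region-b.example.com",  # hypothetical ALB in Region B
}

print(resolve(records, {"alb-region-a.example.com", "alb-region-b.example.com"}))
print(resolve(records, {"alb-region-b.example.com"}))  # simulated failover
```

The second call mirrors step 4: mark the primary unhealthy and confirm that answers switch to Region B.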
Final tip
If multiple answers work, pick the one that best matches the explicit constraint (for example: lowest cost or least operational effort) while still meeting availability and security requirements.