3.1 — Foundation

Virtual Machines


A VM gives you a full operating system in the cloud: you manage everything from the OS up. Full control, full responsibility. Azure manages the physical hardware underneath.

📖 Virtual Machines — Complete Explanation

An Azure VM is an Infrastructure-as-a-Service (IaaS) compute resource. Microsoft manages the physical server, the virtualisation layer, the networking fabric, and the storage hardware. You manage everything from the operating system upwards — OS patches, runtime, middleware, application code, and data.

What happens when you create a VM?
1. Azure allocates physical resources on a host server in the selected region.
2. A hypervisor (Microsoft Hyper-V) creates an isolated virtual machine.
3. Azure provisions managed disks for the OS and any data disks.
4. A virtual NIC is created and attached to your chosen subnet.
5. A private IP is assigned from the subnet's address range.
6. The VM boots from the OS disk image.

Managed Disks are the storage backing for VM disks. They are separate Azure resources — you can see them in your resource group. This separation means you can stop/deallocate a VM without losing disk data, take disk snapshots independently, or attach a disk to a different VM.

The temporary disk (D:\) is physically located on the host server itself. It is NOT a managed disk. This provides very fast I/O but has a critical limitation: if the VM moves to a different host (resize, host maintenance that redeploys the VM, or deallocate followed by start) the temp disk is completely wiped. An in-place restart normally keeps the VM on the same host, but do not rely on that. Never store anything important on D:\. Use it only for the page file, swap space, or truly temporary scratch data.

VM sizes are combinations of vCPUs, RAM, temporary disk size, max data disk count, network bandwidth, and max IOPS. You pay for the size class per second when the VM is running (allocated). When deallocated, compute billing stops but you still pay for managed disk storage.
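The billing rule above can be sketched in a few lines of Python. This is only an illustration: the function name, the hourly compute rate and the monthly disk cost are made-up assumptions, not Azure prices.

```python
from dataclasses import dataclass


@dataclass
class VmBill:
    compute: float  # charged only while the VM is allocated
    disks: float    # charged whether or not the VM is allocated

    @property
    def total(self) -> float:
        return round(self.compute + self.disks, 2)


def monthly_bill(seconds_allocated: int,
                 compute_rate_per_hour: float,
                 disk_cost_per_month: float) -> VmBill:
    """Bill compute per second of allocation; bill disks for the whole month."""
    compute = round(seconds_allocated * compute_rate_per_hour / 3600, 2)
    return VmBill(compute=compute, disks=disk_cost_per_month)


# Hypothetical rates: 0.20 per hour for compute, 20.00 per month for managed disks.
always_on = monthly_bill(730 * 3600, 0.20, 20.00)     # allocated all month
office_hours = monthly_bill(160 * 3600, 0.20, 20.00)  # deallocated outside work hours
deallocated = monthly_bill(0, 0.20, 20.00)            # deallocated all month
print(always_on.total, office_hours.total, deallocated.total)
```

Even the fully deallocated VM still carries the disk charge, which is the point to remember for the exam.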

🏠
The Metaphor

VM = a private house. You own everything — walls, plumbing, electrical. Full control. You fix it when it breaks, you patch the roof, you mow the lawn.

App Service = a serviced apartment. You bring your furniture (code) — maintenance is handled. Can't knock walls down.

Functions = a hotel room by the hour. Pay only when you're there. Nothing is yours permanently.

VM Architecture — Components & Relationships
Resource group: RG-Kube-Prod
Virtual machine: kube-vm-prod-01, size D4s_v3 (4 vCPU, 16 GB RAM)
• OS disk: 128 GB, persists ✓
• Temp disk (D:\): ⚠ lost on deallocate
• Data disks: 0 to 64 attached (the actual limit depends on the VM size), persist ✓; managed disks (Ultra / Premium / Standard)
• NIC: private IP 10.40.1.4, public IP (optional), NSG attached here
• Public IP: separate resource, Standard SKU, static
• Availability: Availability Set → 99.95% SLA; Zones → 99.99% SLA; set at creation only
• Extensions: Custom Script, Azure Monitor Agent, Dependency Agent
• Managed Identity: system-assigned or user-assigned
Network path: the VM sits inside a Subnet → inside a VNet → NSG filters traffic → UDR controls routing

VM Size Families

| Series | Optimised For | CPU:RAM | Use Case |
| --- | --- | --- | --- |
| B-series (Burstable) | Variable workloads: earns CPU credits when idle, spends them when busy | Variable | Dev/test, low-traffic apps, CI/CD agents |
| D-series | General purpose: balanced CPU + memory | 1:4 | Web servers, APIs, most workloads (DEFAULT choice) |
| E-series | Memory optimised | 1:8 | SAP HANA, Redis, in-memory databases |
| F-series | Compute optimised | 1:2 | Batch processing, gaming, CPU-heavy computation |
| N-series | GPU enabled | Varies | ML training, rendering, AI/computer vision |
| M-series | Massive memory | 1:30+ | Largest SAP workloads, TBs of RAM |
| L-series | Storage optimised: high disk throughput | 1:8 | NoSQL databases, data warehousing |
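One way to memorise the series table is to write it as a decision function. The sketch below is a study aid, not an Azure API: the function name, its parameters and the 1 TB cut-off for M-series are assumptions chosen for illustration.

```python
def pick_vm_series(*, gpu: bool = False, ram_tb: float = 0.0,
                   bursty: bool = False, gib_per_vcpu: float = 4.0) -> str:
    """Return the series letter suggested by the table for a workload."""
    if gpu:
        return "N"   # GPU enabled: ML training, rendering
    if ram_tb >= 1:  # assumed cut-off for "massive memory"
        return "M"
    if bursty:
        return "B"   # low baseline with occasional spikes
    if gib_per_vcpu >= 8:
        return "E"   # memory optimised (1:8)
    if gib_per_vcpu <= 2:
        return "F"   # compute optimised (1:2)
    return "D"       # general purpose (1:4), the safe default


print(pick_vm_series(gpu=True))          # ML training
print(pick_vm_series(ram_tb=2))          # SAP HANA with 2 TB RAM
print(pick_vm_series(bursty=True))       # CI/CD agent
print(pick_vm_series(gib_per_vcpu=2))    # high CPU, low memory
print(pick_vm_series())                  # unknown/general
```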
📋
Exam Pattern — VM Series Selection
"High CPU, low memory" → F-series. "SAP HANA or Redis" → E-series. "ML/GPU training" → N-series. "Dev/test with low baseline but occasional spikes" → B-series. "Unknown/general" → D-series (always safe default).

Disk Types — Performance vs Cost

| Type | Latency | Max IOPS | Use Case |
| --- | --- | --- | --- |
| Ultra Disk | <1ms | 160,000+ | SAP HANA, tier-1 databases (highest performance) |
| Premium SSD v2 | ~1ms | 80,000 | Production databases, I/O-sensitive workloads |
| Premium SSD | ~5ms | 20,000 | Production VMs, SQL Server (most common choice) |
| Standard SSD | ~10ms | 6,000 | Dev/test, lightly used web servers |
| Standard HDD | Variable | 2,000 | Infrequent access, backups, archival |
⚠️
Temp Disk (D:\) — #1 Data Loss Scenario on Exam
The temporary disk is physically on the host server. It is wiped when the VM is deallocated, resized, or moved. Never store anything you want to keep on D:\. Use it only for: page file, swap, temp files. Application data, databases, logs → always on a data disk or OS disk.
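The IOPS ceilings in the disk table can be turned into a simple lookup for the lowest tier that meets a requirement. This is a rough sketch: the function name is hypothetical, the list order assumes that price rises with performance, and real limits also depend on disk size and VM size.

```python
# (disk type, max IOPS) taken from the table, ordered from lowest to highest tier.
DISK_TYPES = [
    ("Standard HDD", 2_000),
    ("Standard SSD", 6_000),
    ("Premium SSD", 20_000),
    ("Premium SSD v2", 80_000),
    ("Ultra Disk", 160_000),
]


def lowest_tier_for(required_iops: int) -> str:
    """Return the first disk type whose IOPS ceiling covers the requirement."""
    for name, max_iops in DISK_TYPES:
        if required_iops <= max_iops:
            return name
    raise ValueError(f"No single disk in the table offers {required_iops} IOPS")


print(lowest_tier_for(1_500))
print(lowest_tier_for(15_000))
print(lowest_tier_for(50_000))
```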

Availability — Sets vs Zones

Availability Set vs Availability Zones — What Each Protects Against
Availability Set (same datacenter, 99.95% SLA)
• FD 0 / Rack 0: VM1, VM2
• FD 1 / Rack 1: VM3
• FD 2 / Rack 2: VM4
• Protects: rack failure + planned maintenance
• ❌ Datacenter floods → ALL VMs affected

Availability Zones (different datacenters, 99.99% SLA)
• Zone 1 / Datacenter A: VM1
• Zone 2 / Datacenter B: VM2
• Zone 3 / Datacenter C: VM3
• Protects: rack failure + datacenter failure
• ✓ One datacenter floods → other zones serve traffic
⚠️
Availability Set is configured AT CREATION ONLY
You cannot add an existing VM to an Availability Set after creation. You must delete and recreate the VM. Same for moving to Availability Zones. Plan this before deployment — not after.
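| Option | Protects Against | SLA | Min VMs | Location |
| --- | --- | --- | --- | --- |
| No option | Nothing | 99.9% only if every disk is Premium SSD or Ultra (99.5% with Standard SSD, 95% with Standard HDD) | 1 | Single host |
| Availability Set | Rack failure + planned maintenance (FDs + UDs) | 99.95% | 2+ | Same datacenter |
| Availability Zones | Full datacenter failure | 99.99% | 2+ | Different datacenters, same region |
| VMSS Flex (cross-zone) | Both above + auto-scale | 99.99% | 2+ | Multiple zones |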
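To make the SLA percentages concrete, the sketch below converts them into the downtime each one still permits over a year. It is plain arithmetic on the table's figures; the function and dictionary names are just for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

# SLA figures from the table above.
SLA_PERCENT = {
    "Availability Set": 99.95,
    "Availability Zones": 99.99,
}


def max_downtime_minutes(sla_percent: float,
                         period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Downtime that still satisfies the SLA over the given period."""
    return round(period_minutes * (100 - sla_percent) / 100, 2)


for option, sla in SLA_PERCENT.items():
    print(option, max_downtime_minutes(sla), "minutes per year")
```

That works out to roughly 263 minutes a year for an Availability Set and roughly 53 minutes for Availability Zones, a fivefold difference.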

Stop vs Deallocate — Critical Billing Difference

STOP (OS shutdown)
❌ Still Billing
VM is off inside the OS.
Azure still reserves the hardware for you.
You are still charged for compute.
Public IP (if dynamic) stays assigned.

Like: turning off your office lights but still paying rent.
DEALLOCATE
✅ Stops Billing
VM is off AND Azure releases the hardware.
Compute charges stop (you still pay for disks).
Dynamic public IP released → may get new IP on restart.

Like: vacating the office — no rent owed.
Azure CLI — VM Power Operations
# WRONG — still charges you
az vm stop --resource-group RG-Prod --name kube-vm-01

# CORRECT — stops billing (except disk storage)
az vm deallocate --resource-group RG-Prod --name kube-vm-01

# Start VM again
az vm start --resource-group RG-Prod --name kube-vm-01

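# Check power state (filter on the PowerState code instead of relying on list position)
az vm get-instance-view \
  --resource-group RG-Prod \
  --name kube-vm-01 \
  --query "instanceView.statuses[?starts_with(code, 'PowerState/')].displayStatus" \
  --output tsv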
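If you capture the instance view as JSON, a short script can tell you whether compute is still being billed. This is a minimal sketch: the sample JSON is hand-written and abbreviated, the function name is hypothetical, and it treats every power state other than deallocated as billable.

```python
import json

# Abbreviated, hand-written sample in the shape the --query path above expects.
SAMPLE_INSTANCE_VIEW = json.dumps({
    "instanceView": {
        "statuses": [
            {"code": "ProvisioningState/succeeded",
             "displayStatus": "Provisioning succeeded"},
            {"code": "PowerState/stopped", "displayStatus": "VM stopped"},
        ]
    }
})


def compute_is_billed(instance_view_json: str) -> bool:
    """True unless the VM is deallocated (assumption: all other states bill)."""
    statuses = json.loads(instance_view_json)["instanceView"]["statuses"]
    power_codes = [s["code"] for s in statuses
                   if s["code"].startswith("PowerState/")]
    if not power_codes:
        raise ValueError("No PowerState status found in the instance view")
    return power_codes[0] != "PowerState/deallocated"


# A VM that was only shut down is still allocated, so compute still bills.
print(compute_is_billed(SAMPLE_INSTANCE_VIEW))
```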

VM Scale Sets (VMSS)

🏭
The Metaphor

A VMSS is like a factory floor that automatically adds or removes workstations based on how busy it is. Rush hour → more workstations spin up. Quiet time → extra workstations spin down. All workstations are identical clones — same OS image, same config, same code.

VMSS — Autoscale Architecture
Flow: Internet traffic → Load Balancer (distributes evenly) → VM 1, VM 2, VM 3. Azure Monitor watches CPU: above 70% → scale out, below 30% → scale in. Autoscale then adds or removes a VM, with a 5-minute cooldown.
VMSS scale-out flow: high CPU triggers a new VM
1. VM 1 reaches 85% CPU 🔴, above the 70% threshold, so the alert fires.
2. The autoscale engine applies its rule (CPU > 70% → add 1 instance) and provisions a VM.
3. VM 2 is added ✓ with the same image and config, ready in ~3-5 min.
4. A 5-minute cooldown starts; no more scaling happens during the cooldown.
Cooldown prevents thrashing: without it, a scale-out → scale-in → scale-out loop would occur.
📋
VMSS Key Facts for Exam
Default autoscale cooldown = 5 minutes (prevents thrashing).
Max instances per VMSS = 1,000 with Marketplace or Azure Compute Gallery images (600 with a managed custom image).
Orchestration modes: Uniform (identical VMs, auto-repair) vs Flexible (mix VM types, cross-zone).
Scale-in policy: the default balances across zones and fault domains, then removes the VM with the highest instance ID. Can be customised (NewestVM, OldestVM).
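The interaction between thresholds and cooldown is easier to see in a small simulation. The sketch below is not the Azure autoscale engine: the class, its field names and the one-instance step are assumptions. It only models "scale out above 70%, scale in below 30%, then wait 5 minutes".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Autoscaler:
    instances: int = 2
    min_instances: int = 2
    max_instances: int = 10
    scale_out_above: float = 70.0   # average CPU %
    scale_in_below: float = 30.0    # average CPU %
    cooldown_minutes: int = 5
    _last_action_at: Optional[int] = None

    def observe(self, minute: int, avg_cpu: float) -> int:
        """Apply one metric sample and return the resulting instance count."""
        in_cooldown = (self._last_action_at is not None
                       and minute - self._last_action_at < self.cooldown_minutes)
        if in_cooldown:
            return self.instances  # ignore the sample: prevents thrashing
        if avg_cpu > self.scale_out_above and self.instances < self.max_instances:
            self.instances += 1
            self._last_action_at = minute
        elif avg_cpu < self.scale_in_below and self.instances > self.min_instances:
            self.instances -= 1
            self._last_action_at = minute
        return self.instances


scaler = Autoscaler()
for minute, cpu in [(0, 85), (2, 90), (5, 90), (10, 20), (11, 20), (15, 20)]:
    print(minute, cpu, scaler.observe(minute, cpu))
```

The samples at minutes 2 and 11 are ignored because they fall inside a cooldown window, which is why the instance count does not bounce on every reading.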
3.2 — PaaS Web Hosting

App Service


App Service is a fully managed PaaS hosting platform for web apps, APIs, and mobile backends. You deploy code — Azure handles OS patches, hardware, load balancing, and scaling.

🏢
The Metaphor

App Service is a fully serviced office suite. You bring your furniture (code) and move in. The building management (Azure) handles electricity, cleaning, security, lifts, and repairs. You cannot knock walls down (no OS-level access) but you also never worry about the boiler breaking.

App Service Plans — Tiers

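| Tier | Autoscale | Slots | VNet Integration | Always On | Custom Domain + SSL |
| --- | --- | --- | --- | --- | --- |
| Free (F1) | ✗ | ✗ | ✗ | ✗ | ✗ |
| Shared (D1) | ✗ | ✗ | ✗ | ✗ | Custom domain only |
| Basic (B1-B3) | Manual only | ✗ | ✓ | ✓ | ✓ |
| Standard (S1-S3) | ✓ 10 instances | ✓ 5 slots | ✓ | ✓ | ✓ |
| Premium (P1v3-P3v3) | ✓ 30 instances | ✓ 20 slots | ✓ | ✓ | ✓ |
| Isolated (I1v2+) | ✓ 100 instances | ✓ 20 slots | ✓ Dedicated ASE | ✓ | ✓ |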
⚠️
Deployment Slots require Standard tier minimum
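Free, Shared, and Basic plans have NO deployment slots. Zero-downtime swap (staging → production) requires at least Standard. VNet Integration, which the app needs in order to call private endpoints, is a separate requirement: current App Service supports it from the Basic tier upward, so check it independently of slots. This is a very common exam scenario.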

Deployment Slots — Zero Downtime Deployment

Deployment Slots — Staging to Production Swap
Production slot: app.kube.com, v1.0 (current live code), 100% traffic, PROD_DB = prod-cosmos.azure.com (sticky)
Staging slot: app-staging.kube.com, v2.0 (new code, test here), pre-warmed + tested, PROD_DB = prod-cosmos.azure.com (sticky)
SWAP (instantaneous): sticky settings stay with their slot, so the DB connection points to the correct environment after the swap.
Zero-downtime deploy, step by step
1. Deploy to the staging slot.
2. Test on the staging URL.
3. SWAP (<10 seconds).
4. v2.0 is live ✓ with zero downtime.
5. Issues? Swap back! The old v1.0 now sits in staging, so instant rollback is available until you deploy the next version.
⚠️
Sticky Settings — Critical for Slot Swaps
After a swap, if production connects to the staging database — sticky settings were NOT configured. Settings marked as "Deployment slot setting" stay with their slot and do NOT swap. Always mark: DB connection strings, environment-specific app settings as sticky. Non-sticky settings travel with the code.
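The swap rule can be modelled with two dictionaries. This is a toy model, not how App Service implements swaps; the function name, the slot layout and the setting names FEATURE_X and the DB values are hypothetical.

```python
def swap_slots(production: dict, staging: dict, sticky_keys: set) -> tuple:
    """Return (new_production, new_staging) after a swap.

    Code and non-sticky settings move to the other slot;
    sticky settings stay where they are.
    """
    def after_swap(own: dict, other: dict) -> dict:
        # Non-sticky settings travel with the incoming code...
        settings = {k: v for k, v in other["settings"].items()
                    if k not in sticky_keys}
        # ...while this slot keeps its own sticky values.
        settings.update({k: v for k, v in own["settings"].items()
                         if k in sticky_keys})
        return {"code": other["code"], "settings": settings}

    return after_swap(production, staging), after_swap(staging, production)


prod = {"code": "v1.0",
        "settings": {"DB_CONNECTION": "prod-db", "FEATURE_X": "off"}}
stage = {"code": "v2.0",
         "settings": {"DB_CONNECTION": "staging-db", "FEATURE_X": "on"}}

# With DB_CONNECTION sticky, production gets v2.0 but keeps the prod database.
new_prod, new_stage = swap_slots(prod, stage, {"DB_CONNECTION"})
print(new_prod)

# Without sticky settings, production ends up on the staging database.
broken_prod, _ = swap_slots(prod, stage, set())
print(broken_prod)
```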
Azure CLI — Deployment Slots
# Create staging slot
az webapp deployment slot create \
  --resource-group RG-Prod \
  --name kube-webapp \
  --slot staging

# Deploy to staging (not production)
az webapp deployment source config-zip \
  --resource-group RG-Prod \
  --name kube-webapp \
  --slot staging \
  --src app.zip

# Swap staging → production (zero downtime)
az webapp deployment slot swap \
  --resource-group RG-Prod \
  --name kube-webapp \
  --slot staging \
  --target-slot production

# Mark app setting as sticky (stays with slot)
# Quote the value: an unquoted ";" would end the shell command early
az webapp config appsettings set \
  --resource-group RG-Prod \
  --name kube-webapp \
  --slot-settings "DB_CONNECTION=Server=prod-db;..."

App Service Networking

| Feature | Direction | What It Does | Tier Required |
| --- | --- | --- | --- |
| VNet Integration | Outbound | App reaches resources inside VNet (Private Endpoints, VMs) | Basic+ |
| Private Endpoint | Inbound | App accessible only via private IP in VNet (no public) | Basic+ |
| Access Restrictions | Inbound | IP allowlist/blocklist on public endpoint | All tiers |
| Hybrid Connections | Outbound | Reach on-premises resources without VNet | Basic+ |
💡
VNet Integration vs Private Endpoint
VNet Integration = App reaches OUT into VNet. Needed to call Cosmos DB/Storage private endpoints.
Private Endpoint = Others reach INTO app via private IP. Hides app from public internet.
For Kube: Function App uses VNet Integration (outbound) to reach Cosmos DB private endpoint.
3.3 — Serverless

Azure Functions


Functions are event-driven, serverless compute. You write a function — it runs when triggered (HTTP request, queue message, timer, blob created). Pay only when executing. Azure handles everything else.

The Metaphor

Functions are like a light switch in a hotel corridor. Nobody pays for electricity when the corridor is empty. Someone walks in (trigger) → light turns on (function runs) → person leaves → light turns off. You pay only for the time the light was on. The hotel (Azure) maintains the wiring — you just define what the switch controls.

Hosting Plans — Critical Differences

| Plan | Cold Start | Scale to Zero | VNet Integration | Max Timeout | Cost Model |
| --- | --- | --- | --- | --- | --- |
| Consumption | ⚠ Yes (1-10s) | ✓ Yes (free at zero) | ✗ No | 10 min | Per execution + GB-s |
| Premium (EP1-EP3) | ✓ No (pre-warmed) | ✗ Min 1 instance | ✓ Yes | Unlimited | Per instance (flat rate) |
| Dedicated (App Svc) | ✓ No | ✗ No | ✓ Yes | Unlimited | App Service plan cost |
| Container Apps | Configurable | ✓ Yes | ✓ Yes | Unlimited | Per vCPU/memory |
⚠️
Private Endpoint Access Requires Premium Plan
Consumption plan Function Apps CANNOT use VNet Integration. If your Function App needs to reach a Cosmos DB private endpoint, Service Bus private endpoint, or any resource inside a VNet, you must move off Consumption. Premium is the usual exam answer; a Dedicated (App Service) plan also supports VNet Integration. This is the most common Function App exam scenario.
📋
Cold Start — What It Is and How to Fix
Cold start: when a Function App on Consumption plan has been idle, the first request takes 1-10 seconds extra as Azure provisions the runtime.

Fix options:
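1. Upgrade to Premium plan (keeps at least one pre-warmed instance ready)
2. Run on a Dedicated (App Service) plan with Always On enabled (no scale to zero, so you pay for the plan around the clock)
3. Use a timer-triggered warm-up function to ping your own endpoint every 5 min (a workaround on Consumption, not a guarantee)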
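The hosting-plan trade-offs reduce to three questions, sketched below. The function and parameter names are hypothetical, and the sketch only distinguishes Consumption from Premium; it uses the 10-minute Consumption timeout from the hosting table.

```python
def pick_functions_plan(*, needs_vnet: bool = False,
                        tolerates_cold_start: bool = True,
                        max_run_minutes: int = 5) -> str:
    """Suggest Consumption unless a requirement rules it out."""
    if needs_vnet:
        return "Premium"      # Consumption has no VNet Integration
    if not tolerates_cold_start:
        return "Premium"      # pre-warmed instances avoid cold starts
    if max_run_minutes > 10:
        return "Premium"      # Consumption executions cap at 10 minutes
    return "Consumption"      # cheapest: scales to zero, pay per execution


print(pick_functions_plan())
print(pick_functions_plan(needs_vnet=True))
print(pick_functions_plan(max_run_minutes=30))
```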
Azure CLI — Function App with Premium Plan
# Create Premium Function App with VNet Integration
az functionapp create \
  --resource-group RG-Prod \
  --name kube-func-prod \
  --storage-account kubestgprod \
  --plan kube-premium-plan \
  --runtime dotnet \
  --runtime-version 8 \
  --functions-version 4

# Enable VNet Integration (required for private endpoints)
az functionapp vnet-integration add \
  --resource-group RG-Prod \
  --name kube-func-prod \
  --vnet kube-vnet \
  --subnet func-subnet

# Assign Managed Identity (for accessing Key Vault, Cosmos etc.)
az functionapp identity assign \
  --resource-group RG-Prod \
  --name kube-func-prod

Durable Functions — Stateful Workflows

🔗
The Metaphor

Regular Functions are one-off tasks. Durable Functions are like a project manager coordinating a team. The manager (Orchestrator) assigns tasks, waits for results, then assigns next tasks based on outcomes — all without everyone being in a room at the same time.

| Role | What It Does | Runs Once? |
| --- | --- | --- |
| Client Function | Starts the orchestration (receives HTTP, queue, etc.) | Yes |
| Orchestrator Function | Defines the workflow: calls activities in order, handles errors | Replays multiple times (must be deterministic) |
| Activity Function | Does the actual work (call API, write to DB, send email) | Once per call |
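Why the orchestrator "replays" and must be deterministic is easier to grasp with a toy model. The sketch below uses a plain Python generator and an in-memory history list; it is not the Durable Functions SDK, and every name in it (orchestrator, ACTIVITIES, run_one_step) is made up for illustration.

```python
def orchestrator():
    """Workflow definition: each yield asks the runtime to run one activity."""
    order = yield ("validate_order", 42)
    receipt = yield ("charge_card", order)
    return f"done:{receipt}"


# Activities do the real work and run once per call.
ACTIVITIES = {
    "validate_order": lambda order_id: f"order-{order_id}",
    "charge_card": lambda order: f"receipt-for-{order}",
}


def run_one_step(history: list) -> tuple:
    """Replay the orchestrator from the start, feeding it recorded results,
    then execute at most one new activity and record its result."""
    workflow = orchestrator()
    try:
        request = next(workflow)
        for recorded_result in history:
            request = workflow.send(recorded_result)  # replay, no re-execution
    except StopIteration as finished:
        return True, finished.value
    name, argument = request
    history.append(ACTIVITIES[name](argument))
    return False, None


history = []
done, result = False, None
while not done:
    done, result = run_one_step(history)
print(history, result)
```

Each pass re-runs the orchestrator code from the top, which is why it must reach the same yields in the same order every time: anything non-deterministic (random numbers, the current time, direct I/O) would make the recorded history stop matching.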
📋
Durable Functions Exam Scenarios
"Process IoT sensor data in multiple sequential steps without losing state" → Durable Functions.
"Human approval required mid-workflow (wait for days)" → Durable Functions with human interaction pattern.
"Fan-out: process 100 items in parallel then aggregate results" → Durable Functions fan-out/fan-in pattern.
3.4 — Containers

Containers — Docker, ACR, ACI, Container Apps


Containers package code + runtime + dependencies into a portable unit. The same image runs the same way on any machine with a compatible container runtime. Azure offers multiple container hosting options, from running a single container to full orchestration.

📦
The Metaphor

Docker Image = a shipping container blueprint (the mold). Read-only. Defines everything inside.
Docker Container = an actual container built from the blueprint. Running instance.
ACR (Azure Container Registry) = Azure's private DockerHub. Your company's secure registry to store and pull images.

Container Flow — Build to Run in Azure
Build-to-run flow: Dockerfile (defines layers: FROM, RUN, COPY) → build → Docker image (read-only layers, kube-api:v2.0) → push → Azure Container Registry (ACR), kube.azurecr.io → pull + run → ACI (simple), Container Apps, or AKS (k8s).
ACR authentication: admin user OR Managed Identity (preferred, no secrets) with the acrPull role on the ACR.

ACI — Azure Container Instances

Serverless Containers — No VM Management
Run a Container in Seconds — No Cluster Needed
ACI = the simplest way to run a container in Azure. No VM, no Kubernetes, no cluster. Just: az container create and it's running in ~30 seconds.

Restart policies:
Always (default) — restart on exit (long-running services)
Never — run once, exit when done (batch jobs)
OnFailure — restart only on error exit code

Use for: batch jobs, CI/CD tasks, short-lived tasks, dev/test. NOT for production web apps with autoscaling needs.
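The three restart policies boil down to one decision on the container's exit code. A minimal sketch, assuming the usual convention that exit code 0 means success; the function name is hypothetical.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether a container that just exited is started again."""
    if policy == "Always":
        return True             # long-running services
    if policy == "Never":
        return False            # run once, batch jobs
    if policy == "OnFailure":
        return exit_code != 0   # restart only after an error exit
    raise ValueError(f"Unknown restart policy: {policy}")


for policy in ("Always", "Never", "OnFailure"):
    print(policy, should_restart(policy, 0), should_restart(policy, 1))
```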

Container Apps vs AKS — The Key Difference

| Feature | ACI | Container Apps | AKS |
| --- | --- | --- | --- |
| Kubernetes managed by | N/A | Microsoft ✓ | You manage |
| Scale to zero | ✗ (no autoscaling) | ✓ Yes | ✗ Nodes cost money |
| Traffic splitting (revisions) | ✗ | ✓ Yes | ✓ Ingress |
| Event-driven scaling (KEDA) | ✗ | ✓ Built-in | ✓ Add-on |
| AZ-104 exam tested? | ✓ Yes | ✓ Yes | Background only |
| Best for | Simple tasks, batch | Microservices, event processors | Full enterprise k8s |
⚠️
AZ-104 Tests Container Apps — NOT AKS Deeply
AKS is covered at awareness level only in AZ-104. Container Apps is the exam-tested container service. Know: revisions (versions of your app), traffic splitting between revisions, KEDA event-driven scaling, and scale-to-zero behaviour. AKS deep knowledge is for AZ-204 (Developer) or AZ-305 (Architect).
AKS — Awareness Level for AZ-104
Azure Kubernetes Service — What You Need to Know
AKS = managed Kubernetes cluster. Microsoft manages the control plane (free). You manage worker nodes (you pay for VMs).

For AZ-104 exam — know these facts only:
• AKS clusters use a managed identity by default; attaching an ACR grants the cluster's kubelet identity the AcrPull role so nodes can pull images
• Node pools can use Availability Zones for VM placement
• Azure CNI vs Kubenet networking modes
• Integration with Azure Monitor + Log Analytics for container insights

Deep Kubernetes = AZ-204 / CKA certification territory.
3.5 — Infrastructure as Code

ARM Templates & Bicep


Infrastructure as Code — define Azure resources in files, deploy consistently every time. ARM = JSON format (verbose). Bicep = Microsoft's cleaner DSL that compiles to ARM.

📐
The Metaphor

ARM Template = architect's blueprint written in legalese. Precise, complete, but takes an expert to read.
Bicep = same blueprint rewritten in plain English. Same end result, much easier to write and maintain. A junior engineer can read it.

Both describe the same building — Bicep is just friendlier to the architect.

| Feature | ARM Template (JSON) | Bicep |
| --- | --- | --- |
| Language | JSON (verbose, deeply nested) | Bicep DSL (clean, readable) |
| dependsOn | Must write manually | Auto-detected from references |
| Modules | Linked templates (complex) | Native modules (simple) |
| Compiles to | Itself | ARM JSON (Azure only sees ARM) |
| What-if support | ✓ | ✓ |
| Microsoft recommendation | Legacy (still supported) | Current standard ✓ |

ARM Template (JSON)

ARM — Storage Account
{
  "$schema": "https://schema.management.azure.com/...",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "storageName": {
      "type": "string"
    }
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2023-01-01",
      "name": "[parameters('storageName')]",
      "location": "[resourceGroup().location]",
      "sku": { "name": "Standard_LRS" },
      "kind": "StorageV2"
    }
  ]
}

Bicep (Same Resource)

Bicep — Storage Account
// Much cleaner — same result
param storageName string

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageName
  location: resourceGroup().location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
}

// Output the endpoint
output blobEndpoint string = storageAccount.properties.primaryEndpoints.blob

Deployment Modes — CRITICAL EXAM TOPIC

| Mode | Resources IN Template | Resources NOT in Template | Safe? |
| --- | --- | --- | --- |
| Incremental (DEFAULT) | Created or updated | LEFT ALONE ✓ | ✓ Safe for production |
| Complete (DANGEROUS) | Created or updated | DELETED ❌ | ✗ Never without what-if |
⚠️
Complete Mode Deleted My Cosmos DB
Complete mode deploys your template and deletes everything else in the Resource Group that is not in the template. Always run --what-if before any Complete mode deployment. In production, use Incremental always unless you deliberately want to remove resources not in template.
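The difference between the two modes is plain set arithmetic on resource names. The sketch below models a resource group as a set of names; it is an illustration only, and the function names and resource names are hypothetical rather than anything Azure Resource Manager exposes.

```python
def deploy(existing: set, template: set, mode: str = "Incremental") -> set:
    """Return the resource group contents after a deployment."""
    if mode == "Incremental":
        return existing | template   # resources outside the template are left alone
    if mode == "Complete":
        return set(template)         # anything not in the template is deleted
    raise ValueError(f"Unknown deployment mode: {mode}")


def what_if(existing: set, template: set, mode: str = "Incremental") -> dict:
    """Preview the effect of a deployment without applying it."""
    after = deploy(existing, template, mode)
    return {
        "create": sorted(template - existing),
        "delete": sorted(existing - after),
        "keep": sorted(existing & after),
    }


resource_group = {"storage", "cosmos-db"}   # cosmos-db was created manually
template = {"storage", "app-plan"}

print(what_if(resource_group, template, "Incremental"))
print(what_if(resource_group, template, "Complete"))
```

In Complete mode the preview lists cosmos-db under "delete", which is exactly the surprise that what-if is meant to catch before it happens.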
Azure CLI — Deploy ARM/Bicep with What-If
# Preview changes BEFORE deploying (always do this)
az deployment group what-if \
  --resource-group RG-Prod \
  --template-file main.bicep \
  --parameters storageName=kubestgprod

# Deploy Bicep (Incremental = default = SAFE)
az deployment group create \
  --resource-group RG-Prod \
  --template-file main.bicep \
  --parameters storageName=kubestgprod \
  --mode Incremental

# Share template with team (version it in Template Specs)
az ts create \
  --resource-group RG-Templates \
  --name kube-storage-template \
  --version "1.0" \
  --template-file main.bicep

# Deploy from Template Spec (team uses approved template)
az deployment group create \
  --resource-group RG-Prod \
  --template-spec /subscriptions/.../templateSpecs/kube-storage-template/versions/1.0
Exam Prep

Phase 3 — Exam Q&A


Real AZ-104 exam pattern questions from last 2–3 years.

Q: A VM was resized from D4s_v3 to E8s_v3. The application data stored on D:\ is now missing. Why?
ANSWER
D:\ is the temporary disk, which is local to the physical host. When a VM is resized it may move to a different host — temp disk is wiped.

The temp disk is physically on the hypervisor host server. It provides very fast I/O but is not durable. Always store persistent data on managed data disks (not D:\).

⚠️ Trap: Resizing a VM, deallocating, or platform maintenance can wipe the temp disk. This is guaranteed to appear on your exam.
Q: You shut down a VM from inside the guest OS (or ran az vm stop), but the bill still shows compute charges. Why and how do you fix it?
ANSWER
An OS-level shutdown leaves the VM in the Stopped (allocated) state: Azure still reserves the hardware, so compute is still billed. Deallocate the VM to stop compute billing.

Portal: the Stop button on the VM blade deallocates the VM, and the status then reads "Stopped (deallocated)". CLI: az vm deallocate.

You will always pay for managed disk storage even when deallocated — only compute stops.
Q: A web app needs zero-downtime deployments with instant rollback capability. App Service Basic tier is currently used. What change is required and what is the process?
ANSWER
Upgrade to Standard tier (minimum for deployment slots). Then use staging slot + swap.

Process: Deploy new code to staging slot → test on staging URL → swap staging ↔ production (instantaneous, no downtime) → if issues: swap back immediately (old version is now in staging).

Mark DB connection strings as sticky settings so production always connects to production DB after swap.
Q: A Function App on Consumption plan cannot connect to a Cosmos DB behind a private endpoint. What is the solution?
ANSWER
Upgrade Function App to Premium plan and enable VNet Integration pointing to a subnet in the same VNet as the Cosmos DB private endpoint.

Steps: 1) Create Premium Function App plan. 2) Enable VNet Integration → select a dedicated subnet in your VNet. 3) Ensure private DNS zone for Cosmos is linked to that VNet. 4) Function App can now resolve private IP for Cosmos.

Consumption plan cannot use VNet Integration — this is an absolute platform limitation.
Q: An ARM template deployment removes a Cosmos DB that was manually created in the Resource Group. What caused this?
ANSWER
Complete deployment mode was used. Complete mode deletes all resources in the Resource Group that are NOT defined in the template.

Fix: switch to Incremental mode (the default). Always run az deployment group what-if before deploying to see what will be created, modified, or deleted.

⚠️ Never use Complete mode without running what-if first. Even if only testing — production RGs may have resources you forgot about.
Q: Which VM series would you choose for: (a) SAP HANA with 2TB RAM, (b) ML model training, (c) burst-only CI/CD pipeline agent?
ANSWER
(a) M-series — massive memory (TBs of RAM) for SAP HANA.
(b) N-series — GPU-enabled for ML training and AI workloads.
(c) B-series — burstable, earns CPU credits when idle, spends on burst. Perfect for CI/CD agents that run intermittently.
Q: Container Apps vs ACI: when would you use each?
ANSWER
ACI: simple, short-lived tasks — batch jobs, one-off processing, dev/test. Run once, exit. No auto-scaling, no revisions.

Container Apps: production microservices — autoscale (including to zero), traffic splitting between revisions (A/B testing), KEDA event-driven scaling, VNet integration.

If the scenario mentions autoscaling, microservices, or traffic splitting → Container Apps. If it says "batch job, run and complete" → ACI.
Phase 3 — Cheat Sheet
| Scenario | Answer |
| --- | --- |
| Stop VM to stop billing completely | DEALLOCATE, not an OS-level stop (a stopped but allocated VM still bills) |
| VM data lost after resize | Was on temp disk D:\, which is local to the host and wiped on resize/deallocate |
| Zero downtime web app deployment | Deployment Slots (requires Standard tier minimum) |
| Sticky settings: what happens on swap? | Stay with slot, so the DB connection points to the correct env after swap |
| Function App → private endpoint | Premium plan + VNet Integration (Consumption cannot) |
| Function App cold start fix | Premium plan (pre-warmed instances) OR Dedicated plan with Always On |
| Container: run and exit, batch job | ACI with restart policy: Never |
| Container: autoscale, scale to zero | Container Apps (not ACI, not AKS for exam) |
| ARM Complete mode deleted resources | Expected: Complete deletes anything NOT in template. Use Incremental |
| Preview ARM changes before deploy | az deployment group what-if |
| SAP HANA, 2TB RAM | M-series VM |
| ML training, GPU | N-series VM |
| Availability Set vs Availability Zones | Set = rack protection 99.95%. Zones = datacenter protection 99.99% |
| VMSS autoscale cooldown | 5 minutes default |