Istio Certified Associate (ICA) Study Guide
The Istio Certified Associate (ICA) validates working knowledge of the Istio service mesh: its control-plane/data-plane architecture, traffic management, security (mTLS and authorization), and observability. It is a 90-minute, performance- and knowledge-based exam aimed at platform engineers, SREs, and developers who configure and operate Istio on Kubernetes. A scaled score of 750 is required to pass.
Domain 1: Istio Architecture
- Istio splits into a data plane (Envoy proxies that intercept and move all workload traffic) and a control plane (istiod, which handles service discovery, configuration distribution, and certificate issuance).
- istiod is a single consolidated binary that merged the older Pilot, Citadel, and Galley components into one control-plane process.
- Envoy sidecars intercept pod traffic transparently using iptables rules set up by an init container (istio-init) or the Istio CNI plugin, requiring no application code changes.
- Sidecar injection is enabled by labeling a namespace with istio-injection=enabled (or with istio.io/rev=<revision> for revision-based installs); a mutating admission webhook then injects the proxy at pod creation.
- Existing pods do not get a sidecar retroactively. After labeling a namespace you must restart or redeploy the pods (e.g. kubectl rollout restart) so the webhook fires.
- istioctl kube-inject -f deploy.yaml | kubectl apply -f - performs manual injection without relying on the namespace label / webhook.
- Ambient mesh is Istio's newer sidecar-less data-plane mode: a per-node ztunnel DaemonSet provides L4 connectivity and mTLS, and optional per-namespace/per-service waypoint proxies add L7 features.
- In ambient mode you deploy waypoint proxies only for the namespaces or services that actually need L7 capabilities (routing, L7 authorization), keeping overhead low.
- Installation profiles tune what gets deployed: demo (full features, for learning), default (production baseline), minimal, and ambient. Example: istioctl install --set profile=demo -y.
- Canary/revision upgrades install a new istiod alongside the old one using --set revision=<rev> (e.g. revision=1-20-0), letting you migrate namespaces gradually before uninstalling the old revision.
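- istioctl uninstall --purge -y removes all Istio control-plane components from the cluster, including cluster-scoped resources such as the CRDs; the istio-system namespace itself is left in place and must be deleted separately if no longer needed.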
- Envoy sidecars cache their last-known configuration and certificates, so existing data-plane traffic keeps flowing even if istiod is temporarily down (you just cannot push new config during the outage).
- A Sidecar resource limits which hosts/namespaces a workload's proxy is configured to reach, reducing memory/CPU and config size by scoping egress to only the services it actually calls.
- Multi-cluster meshes enable endpoint discovery between clusters so each control plane learns remote service endpoints; multi-tenancy can be achieved by running separate istiod revisions and trust domains per tenant.
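The Sidecar scoping described in the list above can be sketched as a single namespace-wide resource. This is a minimal illustration, not a recommended production setting: the namespace name shop is hypothetical, and clusters on Istio releases older than 1.22 would likely need apiVersion networking.istio.io/v1beta1 instead.

```yaml
# Hypothetical example: limit every proxy in the "shop" namespace to
# configuration for services in its own namespace plus istio-system.
apiVersion: networking.istio.io/v1
kind: Sidecar
metadata:
  name: default
  namespace: shop          # hypothetical namespace
spec:
  # No workloadSelector, so this applies to all workloads in the namespace.
  egress:
  - hosts:
    - "./*"                # services in the same namespace
    - "istio-system/*"     # shared control-plane services
```

With this in place, istiod no longer pushes clusters and endpoints for unrelated namespaces to these proxies, which is where the memory and config-size savings come from.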
Domain 2: Traffic Management
- A VirtualService defines how requests are routed to a service (host matching, weighted splits, header/URI matches, retries, timeouts, fault injection, mirroring); a DestinationRule defines policies applied after routing (subsets, load balancing, connection pools, outlier detection, TLS).
- Subsets are declared in a DestinationRule under spec.subsets, each with a name and labels (e.g. version: v1); a VirtualService route then targets destination.host plus destination.subset.
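- Weighted/canary routing is done with http.route[] entries each carrying a weight value (e.g. 90 to v1, 10 to v2); weights conventionally sum to 100, and current Istio releases treat them as relative proportions (weight divided by the sum of all weights).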
- Header-based routing uses http.match[].headers (e.g. end-user.exact: tester). Match rules are evaluated top to bottom, so put specific matches (like the header route to v2) before the catch-all weighted default.
- Retries are configured with http.retries: { attempts, perTryTimeout, retryOn } (e.g. attempts: 3, perTryTimeout: 2s, retryOn: "5xx,connect-failure").
- Timeouts are set per-route with http.timeout; this is the overall request timeout, distinct from a retry's perTryTimeout.
- Fault injection is defined in a VirtualService: http.fault.delay (fixedDelay plus percentage) injects latency, and http.fault.abort returns synthetic HTTP error codes - both used for resilience testing.
- Traffic mirroring (shadowing) copies live requests to another destination via http.mirror plus mirrorPercentage.value; mirrored traffic is fire-and-forget and the response from the mirror is ignored.
- A Gateway resource configures an Istio ingress/egress proxy at the mesh edge (ports, protocols, TLS); on its own it routes nothing, so a VirtualService must reference it in spec.gateways to route traffic to backend services.
- Server TLS on a Gateway uses tls.mode: SIMPLE (or MUTUAL) with credentialName referencing a Kubernetes TLS secret holding the cert and key.
- Load-balancing policy lives in a DestinationRule under trafficPolicy.loadBalancer: either simple (ROUND_ROBIN, LEAST_REQUEST, or RANDOM) or consistentHash (keyed on httpHeaderName, httpCookie, or useSourceIp for session affinity).
- Circuit breaking combines DestinationRule connectionPool limits (max connections/requests) with outlierDetection, which ejects hosts returning consecutive errors (e.g. consecutive5xxErrors) for a baseEjectionTime.
- External (off-mesh) destinations require a ServiceEntry to register them in the mesh registry; controlling egress to those hosts may also use an egress Gateway.
- A typical canary workflow: define v1/v2 subsets in a DestinationRule, start a VirtualService at 95/5, gradually shift weight, then move 100% to v2 once healthy.
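As a sketch of how the header match, timeout, and retry settings from the list above fit into one VirtualService, consider the fragment below. The reviews host is a hypothetical service name, and the example assumes a DestinationRule elsewhere already defines the v1 and v2 subsets.

```yaml
# Hypothetical example: header-based routing first, then a default route
# with an overall timeout and a retry policy.
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews            # hypothetical service
spec:
  hosts:
  - reviews
  http:
  # Specific match first: rules are evaluated top to bottom.
  - match:
    - headers:
        end-user:
          exact: tester
    route:
    - destination:
        host: reviews
        subset: v2
  # Catch-all default route.
  - route:
    - destination:
        host: reviews
        subset: v1
    timeout: 10s           # overall request timeout, retries included
    retries:
      attempts: 3
      perTryTimeout: 2s    # budget for each individual attempt
      retryOn: 5xx,connect-failure
```

If the two http entries were swapped, the catch-all would match every request and the header rule would never be reached.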
Domain 3: Security
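- Istio assigns each workload a SPIFFE identity derived from its Kubernetes service account, formatted as spiffe://<trust-domain>/ns/<namespace>/sa/<serviceaccount> (e.g. spiffe://cluster.local/ns/foo/sa/bar; AuthorizationPolicy principals are written without the spiffe:// prefix, as cluster.local/ns/foo/sa/bar).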
- istiod issues short-lived X.509 certificates that the istio-agent in each pod automatically rotates, so there is no manual certificate management.
- Mutual TLS encrypts and mutually authenticates service-to-service traffic between sidecars; it is configured with a PeerAuthentication resource.
- PeerAuthentication mTLS modes: STRICT (only mTLS accepted), PERMISSIVE (accepts both plaintext and mTLS - the default, used during migration), and DISABLE.
- A mesh-wide STRICT policy is a PeerAuthentication in the root namespace (istio-system) with no selector and mtls.mode: STRICT; it can be overridden by namespace- or workload-scoped policies.
- During migration keep PERMISSIVE so non-injected/legacy clients keep working, then tighten to STRICT once every client is in the mesh; you can override specific legacy namespaces back to PERMISSIVE.
- portLevelMtls lets you override the mTLS mode for a specific port (e.g. map a port to mode: DISABLE) while the rest of the workload stays STRICT.
- AuthorizationPolicy enforces access control with action ALLOW (default), DENY, AUDIT, or CUSTOM; rules combine from (sources), to (operations like paths/methods), and when (conditions).
- A deny-all policy is an AuthorizationPolicy with an empty spec (spec: {}): the action defaults to ALLOW and, with no rules, no request matches, so nothing is allowed; alternatively action: DENY with rules: [{}] denies everything in scope.
- Authorization rules match on source identity via from.source.principals (the SPIFFE/service-account identity) and on namespaces, IP blocks, request paths, and HTTP methods.
- End-user (JWT) authentication uses a RequestAuthentication resource to validate tokens; AuthorizationPolicy then matches from.source.requestPrincipals (e.g. ["*"] to require any valid JWT) and JWT claims via when conditions.
- Zero-trust / least-privilege is implemented as a default-deny baseline plus explicit allow rules per service identity, rather than relying on network position.
- istioctl proxy-config secret <pod>.<namespace> inspects the certificates a proxy currently holds, useful for debugging mTLS handshake and identity issues.
- Multi-team RBAC uses namespace-scoped AuthorizationPolicies with Kubernetes RBAC that grants each team write access only within its own namespace, preventing cross-namespace policy tampering.
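The default-deny baseline plus explicit allow rule described in the list above might look like the following pair of policies. The namespace foo, the app: httpbin label, and the /status/* path are hypothetical values chosen for illustration; the principal reuses the foo/bar identity from the SPIFFE example.

```yaml
# Baseline: empty spec means action ALLOW with no rules, so nothing is allowed.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: foo           # hypothetical namespace
spec: {}
---
# Explicit allow: only the "bar" service account may GET /status/* on httpbin.
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-bar-get
  namespace: foo
spec:
  selector:
    matchLabels:
      app: httpbin         # hypothetical workload label
  action: ALLOW
  rules:
  - from:
    - source:
        # Principal is written without the spiffe:// prefix.
        principals: ["cluster.local/ns/foo/sa/bar"]
    to:
    - operation:
        methods: ["GET"]
        paths: ["/status/*"]
```

Matching on principals relies on mTLS between the workloads, since the source identity is taken from the peer certificate.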
Domain 4: Observability
- Istio gives you the three observability signals out of the box without app instrumentation: metrics, distributed traces, and access logs, all emitted by the Envoy sidecars.
- Standard Istio request metrics (istio_requests_total, istio_request_duration_milliseconds, istio_request_bytes, istio_response_bytes) are exposed by the proxies and scraped by Prometheus; dimensions such as response_code are carried as labels.
- The common observability add-on stack is Prometheus and Grafana for metrics, Jaeger or Zipkin for traces, and Kiali for the mesh topology graph.
- Kiali renders the service-graph topology with traffic health, request rates, and security (mTLS) status badges on each edge.
- Kiali derives its mTLS/security badges from Istio metrics in Prometheus (e.g. the connection_security_policy label); if those metrics or the Prometheus integration are missing, badges show as Unknown.
- Istio does NOT generate end-to-end traces by itself - applications must propagate the trace-context headers (B3 headers or W3C traceparent) from inbound to outbound requests, or traces break into disconnected single-hop spans.
- Trace sampling is head-based and propagated: the root/ingress proxy makes the sampling decision and downstream hops honor it, rather than each hop deciding independently.
- The Telemetry API (telemetry.istio.io/v1) is the recommended way to configure metrics, tracing (including sampling and custom tags), and access logging for the whole mesh or per-namespace.
- A namespace-scoped Telemetry resource overrides the mesh default - for example disabling access logging (or pointing it at a no-output provider) for just that namespace.
- To control metric cardinality, use the Telemetry API / metric tag customization to drop or aggregate high-cardinality dimensions while keeping standard request, duration, and response-code metrics.
- Custom metric dimensions can only be populated from attributes or headers actually available to Envoy at emit time; absent or non-propagated headers yield empty tag values.
- Per-pod tracing sampling can be set with the proxy.istio.io/config annotation, whose value is a ProxyConfig override (YAML or JSON) containing tracing.sampling.
- istioctl x describe pod <pod> summarizes a pod's mesh config (mTLS, routes, policies), while istioctl proxy-config (clusters/listeners/routes/endpoints) dumps the proxy's live Envoy configuration for debugging.
- If a proxy has not acknowledged the latest CDS (cluster) config from istiod, it indicates a config push/sync problem; istioctl proxy-status shows SYNCED vs STALE/NOT SENT state per proxy.
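To illustrate the mesh-default versus namespace-override behavior of the Telemetry API noted in the list above, here is a hedged sketch. The resource names and the batch namespace are hypothetical, the 10 percent sampling rate is an arbitrary illustrative value, and the tracing setting only has an effect if a tracing provider is already configured for the mesh.

```yaml
# Mesh-wide default: lives in the root namespace (istio-system).
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: mesh-default       # hypothetical name
  namespace: istio-system
spec:
  accessLogging:
  - providers:
    - name: envoy          # built-in Envoy access log provider
  tracing:
  - randomSamplingPercentage: 10   # illustrative value, assumes a tracing provider exists
---
# Namespace override: turn access logging off for one namespace only.
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: quiet-logs         # hypothetical name
  namespace: batch         # hypothetical namespace
spec:
  accessLogging:
  - disabled: true
```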
Istio Certified Associate (ICA) exam tips
- The exam is hands-on and command-heavy: practice istioctl (install, kube-inject, analyze, proxy-config, proxy-status, x describe) and kubectl until applying and debugging YAML is muscle memory under time pressure.
- Memorize which resource owns which job: VirtualService = routing/retries/timeouts/fault/mirror; DestinationRule = subsets/load-balancing/connection-pools/outlier-detection; Gateway = edge ports/TLS; ServiceEntry = external hosts.
- Know the security pairing cold: PeerAuthentication controls mTLS mode (STRICT/PERMISSIVE/DISABLE), RequestAuthentication validates JWTs, and AuthorizationPolicy enforces allow/deny on identities and request attributes.
- When a trace is broken or incomplete, suspect missing application-level propagation of B3/traceparent headers before blaming Istio config; sampling is head-based and propagated downstream.
- Use istioctl analyze to catch misconfigurations and istioctl proxy-status to spot proxies that are STALE/out of sync with istiod - these are fast wins for the troubleshooting questions.
Study guide FAQ
What is the difference between sidecar mode and ambient mode?
Sidecar mode injects an Envoy proxy into every pod to handle both L4 and L7 traffic. Ambient mode is sidecar-less: a per-node ztunnel provides L4 connectivity and mTLS for all enrolled pods, and you add per-namespace or per-service waypoint proxies only where L7 features (advanced routing, L7 authorization) are needed, reducing per-pod overhead.
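A rough sketch of enrolling a namespace in ambient mode and attaching a waypoint is shown below. It assumes the ambient profile and the Kubernetes Gateway API CRDs are installed; the namespace shop and the waypoint name are hypothetical.

```yaml
# Enroll the namespace in ambient mode (ztunnel handles L4 and mTLS)
# and point its services at a waypoint for L7 features.
apiVersion: v1
kind: Namespace
metadata:
  name: shop                            # hypothetical namespace
  labels:
    istio.io/dataplane-mode: ambient
    istio.io/use-waypoint: waypoint     # name of the Gateway below
---
# The waypoint itself is a Gateway API resource of class istio-waypoint.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: waypoint
  namespace: shop
  labels:
    istio.io/waypoint-for: service
spec:
  gatewayClassName: istio-waypoint
  listeners:
  - name: mesh
    port: 15008
    protocol: HBONE
```

Namespaces that need only L4 mTLS can carry the dataplane-mode label alone and skip the waypoint entirely.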
Why didn't my pods get a sidecar after I labeled the namespace?
Labeling a namespace with istio-injection=enabled only affects pods created after the label is applied, because injection happens through a mutating webhook at pod-creation time. Existing pods are not modified retroactively - you must restart or redeploy them (e.g. kubectl rollout restart) so the webhook injects the proxy.
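Expressed declaratively, the label is just namespace metadata, as in this minimal sketch. The namespace name shop is hypothetical, and the restart command in the comments is the one mentioned above.

```yaml
# Hypothetical namespace opted in to automatic sidecar injection.
apiVersion: v1
kind: Namespace
metadata:
  name: shop
  labels:
    istio-injection: enabled
    # For revision-based installs, use the revision label instead:
    # istio.io/rev: <revision>
# Pods that already exist are untouched. Recreate them so the webhook runs,
# for example:
#   kubectl rollout restart deployment -n shop
```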
How do I do a canary or weighted release in Istio?
Define version subsets (e.g. v1 and v2) in a DestinationRule using labels, then create a VirtualService with multiple http.route entries each carrying a weight (e.g. 95 to the v1 subset and 5 to the v2 subset, summing to 100). Gradually increase the v2 weight as it proves healthy, then shift 100% to v2.
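The 95/5 starting point of that workflow could be written as follows. The reviews host and the version labels are hypothetical; only the weights change as the rollout progresses.

```yaml
# Subsets map names to pod labels.
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews            # hypothetical service
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
# Weighted split across the two subsets.
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 95
    - destination:
        host: reviews
        subset: v2
      weight: 5            # raise gradually as v2 proves healthy
```

Applying the DestinationRule before the VirtualService avoids a window in which routes reference subsets that do not exist yet.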
What is the difference between PERMISSIVE and STRICT mTLS?
PERMISSIVE (the default) lets a workload accept both plaintext and mTLS traffic, which is essential during migration so non-injected or legacy clients keep working. STRICT accepts only mutually authenticated TLS, rejecting all plaintext. The common path is to run PERMISSIVE mesh-wide while onboarding, then tighten to STRICT once every client is in the mesh.
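The end state of that migration, with one namespace still held back, might be sketched like this. The legacy namespace name is hypothetical.

```yaml
# Mesh-wide STRICT: root namespace, no selector.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Namespace-scoped override for clients not yet in the mesh.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: legacy        # hypothetical namespace
spec:
  mtls:
    mode: PERMISSIVE
```

The namespace-scoped policy takes precedence over the mesh-wide one for workloads in that namespace, so it can be deleted once the last plaintext client is onboarded.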