Local Kubernetes Dev — Part 1: The inner dev loop and why run a cluster locally
Why the Kubernetes dev loop is so slow, why code that works locally breaks in the cluster — and what we will build across this series to fix both.
Picture this: you've written a service, it works great on your laptop, and now you need to ship it to Kubernetes. It feels like there's almost nothing left to do — build a container and hit "deploy." But this is exactly where the real pain begins: the development loop that was instant a moment ago suddenly stretches into minutes, and code that "worked locally" falls over in the cluster for reasons you never even thought about before.
This chapter is about why that happens and what we're going to do about it. Let's agree right away on a running example that will follow us through the whole series.
Our hero — the `myapp` service:
- an HTTP API on port `8080`;
- written in Python 3.12 + FastAPI, launched via `uvicorn`;
- in development mode it uses hot reload: `uvicorn --reload`;
- depends on PostgreSQL.
Locally we'll spin up a cluster with k3d (a lightweight Kubernetes that runs inside Docker), we'll name the cluster dev, the workspace (namespace) will be myapp, and we'll store images in k3d's built-in registry at k3d-registry.localhost:5000. Don't worry about the unfamiliar terms — we'll cover every one of them as it comes up.
What the inner dev loop is
The inner dev loop is the loop a single developer cranks through on their own machine over and over while writing code: change code → build → run/restart → check the result → fix. It happens before you share your changes with the team (via commit and push). The Telepresence docs put it this way: "a single developer should be able to set up and use the inner loop to quickly write and test changes," and most importantly — "the faster the feedback loop, the faster a developer can refactor and test again."
The key phrase here is feedback speed. When you develop myapp locally without any Kubernetes at all, the loop looks roughly like this:
- you change a function in a Python file;
- `uvicorn --reload` picks up the change on its own in a fraction of a second;
- you open a browser or hit `curl localhost:8080/...`;
- you see the result.
The whole loop takes seconds. You can repeat it dozens of times an hour, barely breaking your train of thought about the task at hand. That's a healthy inner loop.
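To make "picks up the change on its own" less magical, here is a conceptual sketch of file-change polling. It is not uvicorn's actual reloader, only an illustration of the general idea under simple assumptions: watch `*.py` files, compare modification times, and call a hypothetical `on_change` callback where a real tool would restart the server.

```python
import time
from pathlib import Path


def snapshot(root: Path) -> dict[str, int]:
    """Map every *.py file under root to its last-modified time (ns)."""
    return {str(p): p.stat().st_mtime_ns for p in root.rglob("*.py")}


def changed_files(before: dict[str, int], after: dict[str, int]) -> set[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    all_paths = before.keys() | after.keys()
    return {p for p in all_paths if before.get(p) != after.get(p)}


def watch(root: Path, on_change, interval_s: float = 0.25) -> None:
    """Poll forever and call on_change(paths) whenever a snapshot differs.

    on_change is a hypothetical callback; a real reloader would restart
    the server process at this point.
    """
    previous = snapshot(root)
    while True:
        time.sleep(interval_s)
        current = snapshot(root)
        diff = changed_files(previous, current)
        if diff:
            on_change(diff)
            previous = current
```

The point of the sketch is the cost model: detecting a change is a cheap local comparison, which is why this loop stays within seconds.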
It's important not to confuse it with the outer loop. The outer loop begins after git push: the CI pipeline, building artifacts, delivery into the cluster with tools like ArgoCD or Helm, and integration tests on a shared environment. This series is almost entirely about the inner loop: how to make it fast and, at the same time, similar to prod.
Why the loop gets slow in Kubernetes
The moment we decide to run myapp not just as a process but as a Pod (a Pod is the smallest deployable unit in Kubernetes, a wrapper around one or more containers), a whole chain of steps wedges itself in between "I changed the code" and "I see the result." The Telepresence docs list four additional steps that didn't exist when you ran things locally:
- package the code into a container (`docker build`);
- write/update a Kubernetes manifest;
- push the image to a registry (`docker push`);
- deploy the container to the cluster and wait for the Pod to come up.
Now the loop for myapp looks like this:
    # 1. build the image
    docker build -t k3d-registry.localhost:5000/myapp:dev .

    # 2. push to the registry
    docker push k3d-registry.localhost:5000/myapp:dev

    # 3. apply the manifests
    kubectl apply -f k8s/

    # 4. wait for the new Pod and check that it's alive
    kubectl rollout status deployment/myapp -n myapp
    kubectl logs -f deployment/myapp -n myapp

Each pass like this is, by oneuptime's estimate, on the order of 2–5 minutes per iteration: editing the code, building the image, pushing to the registry, updating the manifests, waiting for the Pod to restart, checking. Compare that with the seconds it takes with uvicorn --reload: the difference is orders of magnitude. The same source notes that syncing files straight into a running Pod with hot reload brings the iteration back down to 1–5 seconds, that is, it cuts the loop time by 95% or more. It's precisely for this speedup that tools like Tilt exist, and we'll devote a whole chapter to them.
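The "95% or more" figure follows directly from the two ranges quoted above. A small worked calculation, using only those numbers (the function names are ours, purely for illustration):

```python
def reduction_pct(before_s: float, after_s: float) -> float:
    """Percentage by which one iteration got shorter."""
    return (before_s - after_s) / before_s * 100


def iterations_per_hour(iteration_s: float) -> float:
    """Upper bound on loop passes per hour, ignoring thinking time."""
    return 3600 / iteration_s


# Least favourable pairing: fastest rebuild loop vs. slowest synced loop.
worst = reduction_pct(before_s=2 * 60, after_s=5)
# Most favourable pairing: slowest rebuild loop vs. fastest synced loop.
best = reduction_pct(before_s=5 * 60, after_s=1)

print(f"reduction: {worst:.1f}% to {best:.1f}%")  # reduction: 95.8% to 99.7%
print(f"passes per hour at 2 min: {iterations_per_hour(120):.0f}")  # 30
print(f"passes per hour at 5 s:   {iterations_per_hour(5):.0f}")    # 720
```

Even in the least favourable pairing the reduction is just under 96%, and the ceiling on iterations per hour moves from about 30 to several hundred.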
There's also a separate, very insidious source of friction: the local cluster can't see images from your local Docker daemon. It seems logical that since you just built myapp:dev with docker build, the cluster would pick it up right away — but that's not the case, and you have to either push the image to a registry the cluster can reach or load it there with a special command. Exactly how this is done in k3d is covered in detail in the chapter on containerization. If you skip the delivery step, you'll get a Pod stuck in ImagePullBackOff — the kubelet endlessly tries to pull an image that isn't in the registry.
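As an illustration of what this failure looks like from the outside, the sketch below scans the JSON that `kubectl get pods -o json` produces and reports containers stuck on an image pull. The field path (`status.containerStatuses[].state.waiting.reason`) is part of the Pod status; the sample data, Pod names, and the helper name `image_pull_failures` are made up for the example.

```python
import json

IMAGE_PULL_REASONS = {"ErrImagePull", "ImagePullBackOff"}


def image_pull_failures(pod_list: dict) -> list[tuple[str, str, str]]:
    """Return (pod name, image, reason) for containers stuck pulling an image."""
    failures = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in IMAGE_PULL_REASONS:
                failures.append((name, cs["image"], waiting["reason"]))
    return failures


# Hand-written sample, trimmed to the fields the function reads.
SAMPLE = json.loads("""
{"items": [
  {"metadata": {"name": "myapp-7d9c-abcde"},
   "status": {"containerStatuses": [
     {"name": "myapp", "image": "myapp:dev",
      "state": {"waiting": {"reason": "ImagePullBackOff"}}}]}},
  {"metadata": {"name": "postgres-0"},
   "status": {"containerStatuses": [
     {"name": "postgres", "image": "postgres:16",
      "state": {"running": {}}}]}}
]}
""")

print(image_pull_failures(SAMPLE))
# [('myapp-7d9c-abcde', 'myapp:dev', 'ImagePullBackOff')]
```

Against a real cluster, the input would come from something like `kubectl get pods -n myapp -o json` piped into `json.load(sys.stdin)`.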
The :latest tag trap
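If you specify the image in the manifest as myapp:latest (or with no tag at all), Kubernetes will by default set imagePullPolicy: Always: on every container start the kubelet contacts the registry to resolve the tag, even if the image is already cached on the node. In a local cluster that means an extra round trip at best and a failed start at worst, when the image was never pushed to a registry the cluster can reach. Use specific tags (myapp:dev, myapp:abc123), which default to IfNotPresent, and set imagePullPolicy: IfNotPresent explicitly if needed.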
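The defaulting rule can be written down in a few lines. This is a sketch of the documented behaviour for a manifest that omits `imagePullPolicy`, not code taken from Kubernetes itself; the function name is hypothetical and the parsing is deliberately simple.

```python
def default_pull_policy(image: str) -> str:
    """Pull policy Kubernetes assigns when imagePullPolicy is omitted."""
    if "@" in image:
        # Pinned by digest, e.g. myapp@sha256:...
        return "IfNotPresent"
    # Only the last path segment can carry a tag; a colon earlier in the
    # reference (k3d-registry.localhost:5000/...) is a registry port.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return "Always"  # no tag means :latest is implied
    tag = last_segment.rsplit(":", 1)[1]
    return "Always" if tag == "latest" else "IfNotPresent"


for ref in ("myapp:latest", "myapp", "k3d-registry.localhost:5000/myapp:dev"):
    print(ref, "->", default_pull_policy(ref))
```

Note that the registry port in `k3d-registry.localhost:5000/myapp` is not a tag, so that reference still falls into the "no tag" case and gets `Always`.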
The "works locally / breaks in the cluster" gap
A slow loop is only half the trouble. The other half is bugs that simply can't be seen on a laptop without a cluster. Why? Because the environment where you usually test your code (plain uvicorn or docker compose) doesn't reproduce important Kubernetes mechanisms. The Testkube team puts it bluntly: "CI containers don't have the resource limits, network policies, RBAC, or service mesh rules that exist in production." Hence a whole class of bugs visible only in the cluster:
- OOMKilled: the Pod's container is killed for exceeding its memory limit. Locally memory is "infinite," and `myapp` happily eats as much as it wants; in the cluster a limit is set, and the Pod dies.
- CPU throttling. If you set a CPU limit and the application tries to use more, the container is aggressively throttled by the kernel, hence latency spikes and unexpected health-check failures.
- Network policy blocks. In prod, traffic between Pods may be restricted; locally nothing restricts it, and `myapp` talks to PostgreSQL with no problem, right up until the rollout.
- Readiness probe failures: Kubernetes considers the Pod not ready and doesn't route traffic to it, while you rack your brains over why the service "is there but doesn't respond."
- RBAC: access denied. Locally authorization usually allows everything, but in prod strict RBAC is enabled, and a service that talked to the Kubernetes API just fine locally gets denied right after deployment.
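To see why CPU throttling shows up as latency spikes, here is a simplified back-of-the-envelope model. It assumes the default 100 ms Linux CFS period, a single-threaded request that starts at a period boundary with the full quota available, and nothing else in the container using CPU. Real behaviour is messier; the function names are ours.

```python
import math

CFS_PERIOD_MS = 100.0  # default CFS bandwidth-control period


def quota_ms(cpu_limit_cores: float) -> float:
    """CPU time the container may use per period, e.g. 0.5 cores -> 50 ms."""
    return cpu_limit_cores * CFS_PERIOD_MS


def wall_time_ms(cpu_work_ms: float, cpu_limit_cores: float) -> float:
    """Wall-clock time for one single-threaded burst of CPU work."""
    # One thread can never use more than one full period of CPU per period.
    quota = min(quota_ms(cpu_limit_cores), CFS_PERIOD_MS)
    if cpu_work_ms <= quota:
        return cpu_work_ms  # fits into one period, never throttled
    # Every exhausted quota forces a wait until the next period begins.
    full_periods = math.ceil(cpu_work_ms / quota) - 1
    remainder = cpu_work_ms - full_periods * quota
    return full_periods * CFS_PERIOD_MS + remainder


print(wall_time_ms(cpu_work_ms=40, cpu_limit_cores=0.5))   # 40
print(wall_time_ms(cpu_work_ms=200, cpu_limit_cores=0.5))  # 350.0
```

Under these assumptions, a request needing 200 ms of CPU takes 350 ms with a 0.5-core limit, while on an unlimited laptop it would take about 200 ms. A health check with a tight timeout can fail on exactly this kind of gap.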
The crux of the problem is one and the same: the more your test environment differs from prod, the more bugs "leak" further down the pipeline, where they're many times more expensive to find and fix.
The shift-left idea: catch environment problems as early as possible
Shift-left is a principle: catch problems as early as possible, closer to the start of the development pipeline rather than at the end. It sounds like a truism, but for Kubernetes it has a concrete meaning. It's not just about testing earlier — it's about which environment you test in. Testkube nails it: "Testing earlier in a CI container that doesn't match your cluster isn't shift-left. It's just failing faster in the wrong environment."
The key concept here is environment fidelity. True shift-left for Kubernetes means early validation against real cluster conditions: real resource limits (not an "unlimited" CI container), real network policies, real services in the namespace (not mocks), and verifying that the manifests actually produce healthy Pods at all. This series lives at the level of a lightweight local cluster (kind, k3s/k3d) — real Kubernetes, even if it isn't fully at parity with prod.
Speaking of healthy Pods: the official Kubernetes blog counts skipping resource requests/limits and underestimating health probes among the common pitfalls. Without probes, "Kubernetes thinks the workload is running even if the application inside isn't responding." The distinction between liveness and readiness probes is covered in full in the chapter on getting closer to prod; how to inspect a probe's status while debugging is in the chapter on observability.
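For intuition, an HTTP probe boils down to a request with a short timeout and a status check. The sketch below mimics that from Python using the success rule Kubernetes documents for HTTP probes (any status from 200 up to, but not including, 400). It is an approximation rather than the kubelet's implementation: `urllib` follows redirects on its own, and the `/healthz` path in the usage note is an assumption, not something `myapp` is known to expose.

```python
import urllib.error
import urllib.request

# A probe should hit the target directly, so ignore any proxy settings.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_probe(url: str, timeout_s: float = 1.0) -> bool:
    """True if the endpoint answers in time with a success status."""
    try:
        with _opener.open(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 400
    except urllib.error.HTTPError:
        return False  # 4xx/5xx: the app answered, but says it is not OK
    except (urllib.error.URLError, OSError):
        return False  # refused, unreachable, or timed out
```

With `myapp` running locally, `http_probe("http://localhost:8080/healthz")` would tell you whether such a check passes. A process that is alive but hangs on every request returns `False` here, which is exactly the case a missing probe lets slip through.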
What we'll build by the end of the series
Let's put it all together. We have two problems: the development loop in Kubernetes is slow (minutes instead of seconds), and there's a gap between local and prod (bugs visible only in the cluster). The solution is to assemble a local environment that is both fast and production-like. By the end you'll have a working setup made of four parts:
- A local cluster on k3d: real Kubernetes on your machine, not an imitation of it. We'll spin it up in the k3d chapter.
- Containerization and real manifests. A Dockerfile for `myapp` and Kubernetes manifests: the same primitives (Deployment, Service, probes, resource limits) as in prod, not `docker-compose` as a substitute for a cluster.
- A fast-loop tool: Tilt with live-update, which syncs your code into a running Pod and brings the iteration back to seconds.
- Parity with prod: dependencies like PostgreSQL inside the cluster, proper handling of configs and secrets, networking and Ingress, plus health probes, resource requests/limits, and other techniques that make your local setup truly production-like.
In other words, we'll put shift-left into practice: we'll catch environment problems locally, before the code ever heads off to the shared cluster. And to understand which tool is responsible for what and why we chose k3d and Tilt specifically, let's start with the next chapter — about what a "production-like environment" even means.
Sources
- The developer experience and the inner dev loop — Telepresence docs
- Inner Development Loop Optimization with File Sync to Kubernetes Pods — oneuptime
- Inner Loop and Outer Loop — Stakater KubeStack+ docs
- What Shift-Left Testing Means for Cloud-Native Teams — Testkube Blog
- 7 Common Kubernetes Pitfalls (and How I Learned to Avoid Them) — kubernetes.io blog