Why I Automate Everything
A short note on treating infrastructure as a product.
There is a version of infrastructure work where you SSH into a server, make a change, and hope you remember what you did six months later when something breaks. I spent time in that world. It is not a sustainable place to operate.
The shift happened for me when I stopped thinking about infrastructure as a collection of servers to manage and started treating it as a product to be engineered. Products have requirements, specifications, and a source of truth. They are built deliberately. They are testable. They can be handed to someone else without a two-hour verbal handover.
Infrastructure-as-code is not a new idea, but the discipline it imposes is worth restating: when every meaningful state of your system exists as a text file in a git repository, you gain a history, a diff, a rollback path, and — most importantly — a conversation. You can open a pull request on your infrastructure and have someone else review it before it applies.
The Hidden Cost of Manual
Manual configuration work carries a debt that compounds quietly. The first time you configure a server by hand, it takes an hour. The second time, it takes forty-five minutes because you forgot a step. The fifth time, something goes wrong in production at 2am and you are staring at a terminal trying to reconstruct the exact sequence of commands that made things work before.
The operational cost of manual work is not just the time of doing the task. It is the cognitive load of remembering, the risk of drift between environments, and the bus factor of being the only person who knows what actually runs on that machine.
Automation removes most of that drift. When your staging environment is provisioned from the same declarative spec as production, the gap between “it works in staging” and “it broke in production” narrows significantly. You stop chasing environment-specific snowflakes and start building confidence in a reproducible system.
GitOps as an Operating Philosophy
I run my infrastructure on Talos Linux with ArgoCD as the GitOps controller. Every cluster resource (namespaces, deployments, ingress rules, RBAC policies, secrets manifests) lives in a git repository. ArgoCD watches that repository and continuously reconciles the cluster state against it. When the live state drifts from what is in git, ArgoCD reports it; you do not have to guess.
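To make the reconciliation idea concrete, here is a minimal sketch in Python. It is not how ArgoCD is implemented; it only illustrates the compare-and-converge loop, under the assumption that desired and live state can be modelled as plain dictionaries keyed by a hypothetical "Kind/name" string. The function names diff_state and reconcile are invented for this illustration.

```python
import copy
from typing import Any

# Assumption: state is modelled as {"Kind/name": spec_dict}. A real GitOps
# controller talks to the Kubernetes API instead of in-memory dictionaries.
State = dict[str, dict[str, Any]]


def diff_state(desired: State, live: State) -> list[tuple[str, str]]:
    """Return the (action, key) pairs needed to make live match desired."""
    actions: list[tuple[str, str]] = []
    for key, spec in desired.items():
        if key not in live:
            actions.append(("create", key))   # declared in git, absent in cluster
        elif live[key] != spec:
            actions.append(("update", key))   # present but drifted
    for key in live:
        if key not in desired:
            actions.append(("prune", key))    # exists in cluster, not declared in git
    return actions


def reconcile(desired: State, live: State) -> list[tuple[str, str]]:
    """Apply the diff to live in place and return the actions that were taken."""
    actions = diff_state(desired, live)
    for action, key in actions:
        if action == "prune":
            del live[key]
        else:
            live[key] = copy.deepcopy(desired[key])
    return actions


desired = {
    "Namespace/apps": {},
    "Deployment/web": {"replicas": 2},
}
live = {
    "Deployment/web": {"replicas": 5},      # someone patched this by hand
    "ConfigMap/old": {"key": "value"},      # left over from an earlier release
}

drift = diff_state(desired, live)   # what a controller would report as out of sync
reconcile(desired, live)            # converge live state onto the declared state
```

An empty result from diff_state is the in-sync case. Anything else is drift that is visible and named, rather than something you discover by accident.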
This model has a few practical consequences that become obvious once you live with it:
You stop running kubectl apply directly. The urge to “just patch this real quick” is strong, but doing so creates state that is not captured anywhere. If ArgoCD later syncs, your change is reverted, and if you never wrote it down, it is gone for good. The discipline of going through git first feels like friction until the moment you need to reconstruct what happened, and then it feels essential.
Reviews become possible. When a configuration change is a pull request, it is reviewable. You can catch a missing resource limit before it reaches a cluster. You can spot a misconfigured environment variable before it causes a silent failure at runtime.
Audits become trivial. “What changed and when?” is a question that git answers completely, as long as every change goes through the repository. The audit trail is a side effect of the workflow, not something you have to build separately.
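The review point above can also be partly mechanised. Once configuration is plain data in a repository, a missing resource limit is something a small script can flag before a human looks at the pull request. The sketch below assumes a Deployment-shaped manifest that has already been parsed into a Python dictionary. The function name containers_missing_limits and the choice to require both cpu and memory limits are illustrative assumptions, not a recommendation for every workload.

```python
from typing import Any


def containers_missing_limits(
    manifest: dict[str, Any],
    required: tuple[str, ...] = ("cpu", "memory"),  # assumption: a policy choice
) -> list[str]:
    """Return names of containers that lack any of the required resource limits."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    missing: list[str] = []
    for container in pod_spec.get("containers", []):
        # "resources" or "limits" may be absent or null in the parsed manifest.
        limits = (container.get("resources") or {}).get("limits") or {}
        if any(name not in limits for name in required):
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Hypothetical manifest, trimmed to the fields the check reads.
deployment = {
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "app",
                        "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
                    },
                    {"name": "sidecar"},  # no resources block at all
                    {"name": "worker", "resources": {"limits": {"memory": "128Mi"}}},
                ]
            }
        }
    },
}

offenders = containers_missing_limits(deployment)
```

A check like this could run in CI and fail the pull request when the returned list is non-empty. That moves the catch from runtime to review time.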
Automation as Respect for Your Future Self
I often frame automation decisions this way: am I willing to do this manually the next twenty times? If the answer is no, I should automate it now, while the context is fresh and the motivation is real.
This is not about finding clever solutions or optimising for its own sake. It is about respecting the operational reality that the next time this task comes up, you will have less context, more pressure, and competing priorities. The automation you write today is a gift to yourself at 11pm on a Sunday.
There is also a compounding effect. Each automation you put in place reduces the surface area of things that require human intervention. Over time, the balance of your time shifts from reactive firefighting toward proactive engineering. You get more space to improve the system rather than just maintain it.
What Automation Is Not
Automation does not make complexity disappear. A poorly designed system that is fully automated is still a poorly designed system — it just fails more consistently. The discipline of automation forces you to think clearly about what a process actually is, which often reveals design problems you were papering over with ad-hoc manual steps.
Automation also creates a maintenance surface. Scripts drift, APIs change, dependencies go stale. The investment only pays off when you treat automated workflows with the same care you apply to production code: version-controlled, documented, tested where possible, and reviewed before they change.
The goal is not automation for its own sake. The goal is a system that is legible, reproducible, and auditable — one where you, or anyone else, can understand what is running and why, without having to reconstruct it from memory. Automation is the most reliable path I have found to that outcome.