Running one local LLM node is easy. Running a fleet of them, off-cluster, and trusting that fleet to stay current and stay honest, is the hard part. This is the work that got LLMKube there: declarative, health-gated self-update for off-cluster agents (Helm and Homebrew for the edge), liveness and admission validation so dead or malformed nodes cannot lie to the control plane, and a real end-to-end test. Plus the build-in-public part: the bugs our own dogfooding caught that the unit tests could not, including a self-update path that had quietly disabled itself in production.
Making a fleet of self-hosted LLM agents trustworthy