Every single AI integration feels under-engineered (or not even engineered, in the case of tokenslop), as the creators put exactly the same amount of thought into it that $LLMOFTHEWEEK did into vomiting "You're absolutely right, $TOOL is a great solution for solving your issue!"
We have yet to genuinely standardise bloody help texts for basic commands (Does -h set the hostname, or does it print the help text? Or is it -H? Does --help exist?). Writing man-pages seems like a lost art at this point; everyone points to $WEBSITE/docs (which contains, as you guessed, LLM slopdocs).
We're gonna end up seeing the same loops of "Modern standard for AI" -> "Standard for AI" -> "Not even a standard" -> "Thing of the past" because all of it is fundamentally wrong to an extent. LLMs are purely textual, while network protocols are more intricate by nature. An LLM will always end up overspeccing a /api/v1/ping endpoint while ICMP ping can do the same within a handful of bytes. Text-based engineering, while legible (in the sense that a tech-illiterate person will find it easy to interpret), will always end up forming abstractions over the core - you'll end up with a shaky pyramid that collapses the moment your $LLM model changes encodings.
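To make the size gap concrete, here's a quick sketch comparing a raw ICMP echo request header against a minimal JSON-RPC-style ping body (the JSON payload shape is hypothetical, just for comparison):

```python
import json
import struct

# ICMP echo request header: type (8), code (0), checksum, identifier, sequence.
# The entire request fits in 8 bytes before any optional payload.
icmp_echo = struct.pack("!BBHHH", 8, 0, 0, 0x1234, 1)

# A minimal JSON-RPC-style ping, the kind of thing a text-first spec produces.
json_ping = json.dumps({"jsonrpc": "2.0", "method": "ping", "id": 1}).encode()

print(len(icmp_echo))  # 8 bytes, total
print(len(json_ping))  # several times larger, before HTTP framing is even added
```

And the JSON number doesn't include headers, TLS, or the HTTP envelope; the real wire cost is far higher.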
I follow a similar pattern. My autonomous agent Smith has a service mesh that I plug MCPs into, which gives me a single place to define policy (OPA for life) and monitoring. The service gateway owns the credentials. This pattern is secure, easy to manage, and lets you programmatically generate a CLI from the tool catalog. See https://github.com/sibyllinesoft/smith-gateway if you want to understand the model and how to implement it yourself.
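Generating a CLI from a tool catalog can be as simple as mapping catalog entries onto argparse subcommands. A minimal sketch, with a made-up catalog (the tool names and schema here are illustrative, not smith-gateway's actual format):

```python
import argparse

# Hypothetical tool catalog, the kind a gateway could export.
CATALOG = {
    "search": {"help": "Search the index", "args": ["query"]},
    "fetch": {"help": "Fetch a document by id", "args": ["doc_id"]},
}

def build_cli(catalog):
    """Generate an argparse CLI with one subcommand per catalog entry."""
    parser = argparse.ArgumentParser(prog="gateway")
    sub = parser.add_subparsers(dest="tool", required=True)
    for name, spec in catalog.items():
        cmd = sub.add_parser(name, help=spec["help"])
        for arg in spec["args"]:
            cmd.add_argument(arg)
    return parser

parser = build_cli(CATALOG)
ns = parser.parse_args(["fetch", "42"])
print(ns.tool, ns.doc_id)  # fetch 42
```

The nice property is that the CLI, the policy checks, and the agent's tool list all derive from one catalog, so they can't drift apart.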
The boundary also needs to hold if the agent is compromised. Proxying keys is the right instinct. We took the same approach at the action layer: cryptographic warrants scoped to the task, delegation-aware, verified at the MCP tool boundary before execution. Open source core. https://github.com/tenuo-ai/tenuo
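The shape of "scoped, verified at the tool boundary" can be sketched with plain HMAC: sign the claims at issuance, then check signature, scope, and expiry before the tool runs. This is an illustrative scheme, not tenuo's actual warrant format:

```python
import hashlib
import hmac
import json
import time

SECRET = b"demo-key"  # in practice, proper per-issuer key material

def issue_warrant(tool, task_id, ttl=300):
    """Sign a warrant authorizing exactly one tool for one task."""
    claims = {"tool": tool, "task": task_id, "exp": time.time() + ttl}
    body = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return {"claims": claims, "sig": sig}

def verify_warrant(warrant, tool):
    """Check signature, scope, and expiry at the tool boundary."""
    body = json.dumps(warrant["claims"], sort_keys=True).encode()
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, warrant["sig"]):
        return False  # tampered or forged
    c = warrant["claims"]
    return c["tool"] == tool and c["exp"] > time.time()

w = issue_warrant("delete_file", "task-7")
print(verify_warrant(w, "delete_file"))  # True: in scope, unexpired
print(verify_warrant(w, "send_email"))   # False: out of scope
```

The key property: a compromised agent holding the warrant can't widen its own scope, because any edit to the claims breaks the signature.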
This resonates. The pattern I keep seeing is that the best AI tooling right now is about constraining the agent, not giving it more freedom. MCP gives you a clean boundary between what the AI decides and what the system executes deterministically. I use MCP servers with Claude Code and the biggest win is exactly what you described, the AI handles the creative problem solving but the actual actions go through predictable, auditable paths.
It seems like we're going back to expert systems in a kind of inverted sense with all of this chaining of deterministic steps. But now the "experts" are specialized and well-defined actions available to something smart enough to compose them to create new, more powerful actions. We've moved the determinism to the right spot, maybe? Just a half-thought.
I'm just trying to learn this stuff now, so I don't know the literature. The "trajectory view" through action space is what makes the most sense to me.
Along these lines, another half-baked pattern I see is kind of a time-lagged translation of stuff from modern stat mech to deep learning/"AI". First it was energy-based systems and the complex energy landscape view, a la spin glasses and Boltzmann machines. The "equilibrium" state-space view, concerned with memory and pattern storage/retrieval. Hinton, Amit, Hopfield, MacKay and co.
Now, the trajectory view that started in the 90s with Jarzynski and Crooks and really bloomed in 2010+ with "stochastic thermodynamics" seems to be a useful lens. The agent stuff is very "nonequilibrium"/"active"-system coded, in the thermo sense. With the ability to create, modify, and exploit resources (tools/memory) on the fly, there's deep history and path dependence. I see ideas from recent Wolpert and co. (Susanne Still, Crooks again, etc.) w.r.t. the thermodynamics of computation providing a kind of through line, all trajectory based. That's all very vague, I know, but I recently read the CoALA paper and was very enchanted, and have been trying to combine what I actually know with this new foreign agent stuff.
It's also very interesting to me how the Italian stat mech school, the Parisi family, have continuously put out bangers trying to actually explain machine learning and deep learning success.
I'd love to hear if anyone is thinking along similar lines, or thinks I'm way off track. If you have paper recs, please let me know! Especially papers on the trajectory view of agents.
I have wondered if we're going to end up investing so much in putting up guard rails around AI that we end up with systems of the same complexity as a non-AI expert system, except slower and at higher cost because we've injected models and tokens into the mix! I joke, but it seems like there's a pull towards that.
I think we need to just think of agents as people. The same principles around how we authenticate, authorize, and revoke permissions for people should apply to agents. We don't leave the server room door open for users to type commands into physical machines, for good reason, so we shouldn't do the equivalent with agents, unless they're fully sandboxed or the blast radius of malign or erroneous action is fully accepted.