#012: Building my own coding agent: Human-in-the-loop
One of the first things I wanted to do after setting up the foundations for my coding agent, agx, was to add human-in-the-loop (HITL) controls to it. These controls allow the user to:
- Require explicit approval before running tool calls (creating/editing files, running bash commands, etc.)
- Reject tool calls (with optional feedback to alter the approach taken)
- Interrupt the conversation at any time
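In code, these three controls boil down to a small set of user decisions. A minimal sketch of how they might be modeled (the type is illustrative, not agx’s actual API):

```rust
/// Hypothetical model of the three HITL outcomes; agx's real types may differ.
enum HitlDecision {
    /// Run the tool call as proposed.
    Approve,
    /// Skip the tool call, optionally telling the LLM why.
    Reject { feedback: Option<String> },
    /// Abort the current turn entirely (e.g., the user pressed Ctrl+c).
    Interrupt,
}
```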
Controls like these are non-negotiable for any coding agent that can perform
destructive actions on the user’s machine. Without them, a single hallucinated
rm -rf or an overzealous file edit could cause real damage. However,
implementing HITL controls adds complexity to the agentic loop — state
management becomes trickier, and edge cases multiply. This post describes how I
added such controls to agx and how it led to a fundamental change in its
architecture.
A basic prototype
As described in the last post, I’m using the rig crate to abstract away the
low-level details of interacting with LLM provider APIs. The first
implementation of agx relied on rig’s “multi-turn” functionality to run the
agentic loop, which takes care of calling tools and sending requests to the
provider.
I started by adding basic controls: approve/reject tool calls by typing
<enter>/n, alongside the ability to provide feedback to instruct the LLM to
take a different approach. I also added support for interrupting the
conversation when the user presses Ctrl+c, which involved properly handling
any in-progress tool call invocations.
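In an async runtime, interrupt handling is naturally a race between the in-flight work and the signal. A minimal sketch of the pattern, assuming tokio (with the signal and macros features); the types and the execute_tool stub are placeholders, not agx’s real code:

```rust
use tokio::signal;

// Minimal stand-ins for the real types.
struct ToolCall { id: String }
enum ToolOutcome { Completed(String), Interrupted(String) }

// Placeholder for actually running the tool.
async fn execute_tool(call: &ToolCall) -> String {
    format!("ran tool call {}", call.id)
}

// Race the in-flight tool call against Ctrl+c; the losing branch's
// future is dropped, so an interrupt cancels the in-progress invocation
// and can be recorded in the chat history instead.
async fn run_with_interrupt(call: ToolCall) -> ToolOutcome {
    let id = call.id.clone();
    tokio::select! {
        result = execute_tool(&call) => ToolOutcome::Completed(result),
        _ = signal::ctrl_c() => ToolOutcome::Interrupted(id),
    }
}
```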
rig exposes a “hooks” mechanism where certain functions can be called before a
tool call is fired. Getting a prototype of the HITL controls working with this
mechanism wasn’t too difficult, but it felt architecturally wrong. The hook mechanism
introduced indirection that made state harder to track, and forced me to work
with raw JSON values instead of the concrete types I’d defined for the tools. If
I needed to show additional context for a tool call (e.g., a code diff), I had to
parse and validate it, something that rig’s internal mechanism would be doing
again after the tool call was approved. It felt like I was fighting the
abstraction rather than using it.
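To illustrate the friction, here’s a stand-in for the shape of the problem (not rig’s actual hook signature): the hook only sees raw JSON, so rendering context for the user means re-parsing a value the framework will deserialize again after approval:

```rust
use serde::Deserialize;
use serde_json::Value;

// Illustrative tool arguments for a file-edit tool.
#[derive(Deserialize)]
struct EditFileArgs {
    path: String,
    new_content: String,
}

// A pre-tool-call hook receives the arguments as a raw JSON value, so
// showing the user a diff means parsing and validating something the
// framework will deserialize again itself once the call is approved.
fn before_tool_call(tool_name: &str, raw_args: &Value) -> Result<(), String> {
    if tool_name == "edit_file" {
        let args: EditFileArgs = serde_json::from_value(raw_args.clone())
            .map_err(|e| format!("invalid edit_file args: {e}"))?;
        show_diff(&args.path, &args.new_content);
    }
    Ok(())
}

fn show_diff(path: &str, new_content: &str) {
    println!("would edit {path} ({} new bytes)", new_content.len());
}
```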
The HITL prototype worked, but tool call rejections and conversation
interruptions would sometimes leave the chat in a broken state. Given how large
the payloads sent to LLMs can become, relying on tracing to debug this quickly
became tedious. So,
I decided to take a tangent — something I am guilty of doing quite a lot — to
solve this issue: add a debug UI to agx that would clearly show me its
internal state at each turn of the conversation.
Opening up the Black Box
The debug UI is written in Gleam, using the Elm-inspired framework Lustre. It’s powered by a debug server that runs concurrently alongside the agent. As various events occur in the conversation, this server forwards them to the UI using server-sent events (SSE). The UI renders these events in a vertical layout, with each event getting a distinct color. It also includes a minimap to help understand the agentic loop from a higher vantage point, and to allow easy jumping to a specific event.
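The wire format is simple enough to sketch. Assuming a serde-serializable event type (the names here are illustrative, not agx’s real event set), each conversation event becomes one SSE frame:

```rust
use serde::Serialize;

// Illustrative event type; the real set of events is richer.
#[derive(Serialize)]
#[serde(tag = "kind", rename_all = "snake_case")]
enum DebugEvent {
    UserPrompt { text: String },
    AssistantText { text: String },
    ToolCall { name: String, args: String },
    ToolResult { name: String, output: String },
}

// An SSE frame is just "data: <payload>\n\n" written to a long-lived
// HTTP response; the Lustre UI decodes the JSON payload on the other end.
fn to_sse_frame(event: &DebugEvent) -> String {
    format!("data: {}\n\n", serde_json::to_string(event).unwrap())
}
```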
This debug UI helped a lot in understanding the lower-level details of the
abstractions provided by rig. It also helped me understand why the HITL
prototype was failing on rejections and interruptions: I wasn’t always managing
the chat history correctly when these happened. Most LLM provider APIs mandate
that each tool call be paired with a tool result, a requirement I was failing
to meet in some edge cases.
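The fix follows directly from that invariant: a call that never ran still needs a result in the history. A sketch of the repair step, with deliberately simplified message types:

```rust
// Simplified history types; real provider message schemas vary.
enum Message {
    Assistant { tool_call_ids: Vec<String> },
    ToolResult { call_id: String, content: String },
}

// Enforce the invariant: every tool call id must be answered by a tool
// result, even for calls rejected or interrupted before they ran.
fn close_dangling_tool_calls(history: &mut Vec<Message>) {
    let call_ids: Vec<String> = history
        .iter()
        .flat_map(|m| match m {
            Message::Assistant { tool_call_ids } => tool_call_ids.clone(),
            _ => Vec::new(),
        })
        .collect();
    for id in call_ids {
        let answered = history.iter().any(|m| {
            matches!(m, Message::ToolResult { call_id, .. } if *call_id == id)
        });
        if !answered {
            history.push(Message::ToolResult {
                call_id: id,
                content: "Tool call was rejected or interrupted by the user.".into(),
            });
        }
    }
}
```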
Besides helping fix the HITL issues, the debug UI surfaced a problem I hadn’t
noticed before: rig, at the time of writing, doesn’t include assistant text in
the chat history when it precedes tool calls, which means that in subsequent
turns the LLM loses context about what it previously said. This led me to
discover that rig manages its own internal history for a multi-turn session,
one that diverged from the history I was maintaining manually.
Armed with the debug UI, I was able to properly manage the chat history in the case of rejections and interruptions. It worked but didn’t feel elegant in the multi-turn setup. This, alongside the fact that building HITL controls via the hook mechanism felt brittle, led me to decide that I should be managing turns manually.
Manual Control
Managing turns manually meant taking control of every part of the agentic loop (a sketch in code follows the list):
- Send prompt and stream the response
- Capture assistant text, reasoning, and tool calls
- Parse and validate tool calls
- Pause for user approval, if required (HITL)
- Execute tools (or handle rejection)
- Update chat history and repeat
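Here’s that sketch, with placeholder types and stubs standing in for agx’s real ones (streaming and interrupts elided):

```rust
// Placeholder types and stubs; the real loop streams tokens and handles
// reasoning, but the control flow has this overall shape.
struct Turn { text: String, tool_calls: Vec<ToolCall> }
struct ToolCall { id: String, name: String }
enum HitlDecision { Approve, Reject { feedback: Option<String> } }

async fn send_and_stream(_history: &[String]) -> Turn {
    Turn { text: "done".into(), tool_calls: Vec::new() } // provider call goes here
}
fn ask_user(_call: &ToolCall) -> HitlDecision { HitlDecision::Approve }
async fn execute_tool(call: &ToolCall) -> String { format!("ran {}", call.name) }

async fn agentic_loop(mut history: Vec<String>) {
    loop {
        // 1. Send the history and stream the model's response.
        let turn = send_and_stream(&history).await;
        // 2. Capture the assistant text into the one shared history.
        history.push(format!("assistant: {}", turn.text));
        // A turn with no tool calls ends the loop.
        if turn.tool_calls.is_empty() { break; }
        for call in turn.tool_calls {
            // 3-4. The call is already parsed into a concrete type;
            // pause here for user approval (HITL).
            let result = match ask_user(&call) {
                // 5a. Execute the approved tool call.
                HitlDecision::Approve => execute_tool(&call).await,
                // 5b. Record the rejection (and any feedback) as the result.
                HitlDecision::Reject { feedback } => {
                    feedback.unwrap_or_else(|| "Rejected by user.".into())
                }
            };
            // 6. Every tool call gets a result before the next request.
            history.push(format!("tool {} -> {}", call.id, result));
        }
    }
}
```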
This change required a big refactor to agx, but opened up the door to several
improvements:
- HITL fits nicely in the agentic loop. I get manual control over parsing tool call requests, validating them, displaying context for them to the user, and executing them
- Tool call rejections/cancellations are easier to handle
- There’s one chat history to maintain
I’ve implemented a simple “approval system”: the user can approve certain tool
calls for the whole session (like creating/editing files), while others (like
running bash commands) must be approved on every invocation. The user’s choices
for the latter kind are stored in the directory where agx is run, and can be
picked up on subsequent runs.
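One plausible shape for persisting those choices (the filename and on-disk format here are my invention; the only grounded detail is per-directory storage):

```rust
use std::collections::HashMap;
use std::fs;

// Hypothetical persistence layer using simple "tool=scope" lines.
struct ApprovalStore {
    scopes: HashMap<String, String>, // tool name -> "session" | "always"
}

impl ApprovalStore {
    const FILE: &'static str = ".agx_approvals"; // illustrative filename

    // Read any previously saved choices from the working directory.
    fn load() -> Self {
        let scopes = fs::read_to_string(Self::FILE)
            .unwrap_or_default()
            .lines()
            .filter_map(|line| line.split_once('='))
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        Self { scopes }
    }

    fn is_pre_approved(&self, tool: &str) -> bool {
        self.scopes.get(tool).is_some_and(|s| s.as_str() == "always")
    }

    // Record a choice and write the whole store back to disk.
    fn remember(&mut self, tool: &str, scope: &str) {
        self.scopes.insert(tool.to_string(), scope.to_string());
        let body: String = self
            .scopes
            .iter()
            .map(|(k, v)| format!("{k}={v}\n"))
            .collect();
        let _ = fs::write(Self::FILE, body); // best-effort persistence
    }
}
```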
Having the debug UI available during this refactor was quite helpful — it allowed me to verify exactly what state was being tracked at various steps. I expect it to continue being helpful as I add more features.
I haven’t yet seen how feature-rich agent-building toolkits implement HITL, but
with a lower-level abstraction like rig, implementing it meant confronting the
architecture of modern agentic loops directly. That’s exactly the kind of
understanding I was hoping to gain by building agx in Rust.
Next up: support for configuring multiple agents, and maybe a more ergonomic UI.