agentforge
agentforge is a small Express + TypeScript HTTP service that wraps the Anthropic SDK behind a REST/SSE interface. The endpoint of interest is POST /v1/chat: the server runs an agent loop over the model with tools, streams content_block_delta events out as Server-Sent Events, executes any tool_use blocks server-side, feeds the tool results back as the next user turn, and stops when the model returns a stop_reason of end_turn or the loop hits the max-iteration cap.
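On the wire, the SSE half of this comes down to framing each event as text. A minimal sketch, assuming a hypothetical toSseFrame helper (the name and signature are illustrative, not taken from the agentforge source) and the standard text/event-stream framing:

```typescript
// Hypothetical helper, not part of agentforge: serialises one event
// into a Server-Sent Events frame.
function toSseFrame(event: string, data: Record<string, unknown>): string {
  // JSON.stringify without an indent argument never emits raw newlines,
  // so the payload always fits on a single "data:" line.
  // The blank line (second "\n") terminates the frame.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Example frame for a streamed text fragment.
const frame = toSseFrame("text_delta", { text: "hi" });
```

In an Express handler, frames like this would typically be passed to res.write after the response has been given a Content-Type of text/event-stream.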
The loop itself is implemented as an async generator that yields typed AgentEvent values (iteration_start, text_delta, tool_call, tool_result, usage, done), so the route layer is a thin pump from the generator to the SSE writer. The eval harness drives the same generator directly, without HTTP. Tools are described in a tiny registry (fetch_url with private-IP guards, search_docs over a local fixture set), each carrying its own Zod input schema. When the model produces tool input that fails validation, the loop feeds a structured error back to the model instead of throwing, so Claude can self-correct on the next turn.
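The shape of such a generator can be sketched as follows. This is a simplified illustration under stated assumptions, not the agentforge implementation: the model is a stubbed callModel function, history is reduced to plain strings, the usage event is omitted, and a hand-written validate function stands in for the Zod schema so the block stays dependency-free. Every name except the event types is hypothetical.

```typescript
// Subset of the event types named above; field names are assumptions.
type AgentEvent =
  | { type: "iteration_start"; iteration: number }
  | { type: "text_delta"; text: string }
  | { type: "tool_call"; name: string; input: unknown }
  | { type: "tool_result"; name: string; output: string; isError: boolean }
  | { type: "done"; reason: "end_turn" | "max_iterations" };

// One simplified model turn: some text and, optionally, a tool request.
interface ModelTurn {
  text: string;
  toolUse?: { name: string; input: unknown };
}

interface Tool {
  // Stand-in for a Zod schema: returns an error message, or null when valid.
  validate(input: unknown): string | null;
  run(input: unknown): Promise<string>;
}

type CallModel = (history: string[]) => Promise<ModelTurn>;

async function* runAgent(
  callModel: CallModel,
  tools: Record<string, Tool>,
  userMessage: string,
  maxIterations = 5,
): AsyncGenerator<AgentEvent> {
  const history: string[] = [userMessage];
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    yield { type: "iteration_start", iteration };
    const turn = await callModel(history);
    if (turn.text) yield { type: "text_delta", text: turn.text };
    if (!turn.toolUse) {
      // No tool requested: treat this as the model ending its turn.
      yield { type: "done", reason: "end_turn" };
      return;
    }
    const { name, input } = turn.toolUse;
    yield { type: "tool_call", name, input };

    const tool: Tool | undefined = tools[name];
    const problem = tool ? tool.validate(input) : `unknown tool: ${name}`;
    let output: string;
    if (tool && problem === null) {
      output = await tool.run(input);
    } else {
      // Invalid input becomes an error result for the model, not an exception.
      output = JSON.stringify({ error: problem });
    }
    yield { type: "tool_result", name, output, isError: problem !== null };
    history.push(output); // fed back as the next user turn
  }
  yield { type: "done", reason: "max_iterations" };
}
```

A consumer iterates with for await, which is what lets an HTTP route and an offline harness share one loop: each simply decides what to do with every yielded event.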
Prompt caching is wired in via cache_control: { type: 'ephemeral' } on the system prompt and on the last tool definition. Cache hit and miss counts are exposed at /metrics, so the cache hit rate is graphable, not aspirational. Conversation history is persisted in Postgres with raw SQL migrations (no ORM, mirroring tinybus). The service ships as a distroless Docker image produced by a multi-stage build, deploys to Railway with a one-line railway.json, and comes with vitest + supertest integration tests for the env loader, the tool registry, and the HTTP surface.
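The breakpoint placement described above can be sketched as plain request data. The helper names (withCacheBreakpoints, cacheHitRate) and the hit-rate formula are assumptions for illustration; the usage field names follow the Anthropic Messages API, but how agentforge aggregates them for /metrics is not shown here.

```typescript
interface ToolDef {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical helper: marks the system prompt and the last tool
// definition as cache breakpoints, leaving earlier tools untouched.
function withCacheBreakpoints(systemPrompt: string, tools: ToolDef[]) {
  return {
    system: [
      { type: "text" as const, text: systemPrompt, cache_control: { type: "ephemeral" as const } },
    ],
    tools: tools.map((tool, i): ToolDef =>
      i === tools.length - 1 ? { ...tool, cache_control: { type: "ephemeral" } } : tool,
    ),
  };
}

// Token counters as reported in the API's usage object.
interface Usage {
  input_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
}

// One possible definition of hit rate (an assumption): the share of all
// prompt tokens that were served from cache.
function cacheHitRate(u: Usage): number {
  const total = u.input_tokens + u.cache_creation_input_tokens + u.cache_read_input_tokens;
  return total === 0 ? 0 : u.cache_read_input_tokens / total;
}
```

Placing the breakpoint on the last tool means the whole tool list is cached as one prefix, which suits a registry whose tools rarely change between requests.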
Three design decisions are deliberately left as TODO blocks in the code with the trade-offs explained inline: sequential vs parallel tool execution, prompt-cache breakpoint strategy, and which agent-loop counters to expose. Those are the choices that depend on actual production traffic, not on what reads well in a tutorial.
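For the first of those decisions, the two options look roughly like this. This is an illustrative sketch with hypothetical names (ToolCall, runSequential, runParallel), not the code behind the TODO block:

```typescript
// Hypothetical shape for a pending tool invocation.
type ToolCall = { id: string; run: () => Promise<string> };

// Sequential: each tool starts only after the previous one has finished.
// Total latency is the sum of all calls, but ordering and side effects
// are easy to reason about.
async function runSequential(calls: ToolCall[]): Promise<string[]> {
  const results: string[] = [];
  for (const call of calls) {
    results.push(await call.run());
  }
  return results;
}

// Parallel: all tools start immediately. Total latency is that of the
// slowest call. Results still come back in input order.
async function runParallel(calls: ToolCall[]): Promise<string[]> {
  return Promise.all(calls.map((call) => call.run()));
}
```

Promise.all rejects as soon as any call rejects, so a parallel variant that must report every tool's outcome to the model would more likely use Promise.allSettled. Whether the latency gain is worth the extra load on downstream services is the kind of question that real traffic answers.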