

Large language models hallucinate. They generate plausible-sounding information that’s completely fabricated. This isn’t a bug to be fixed—it’s a fundamental characteristic of how these models work.
Building reliable AI systems means designing for hallucinations, not wishing them away.
Language models predict the most likely next token based on patterns in training data. They don’t “know” things—they generate statistically plausible text.
When asked about something their training data covers poorly, they generate plausible-sounding text anyway. That’s hallucination.
Common hallucination scenarios:
Strategy 1: Retrieval-Augmented Generation (RAG)
Don’t ask AI to remember—give it the information.
Instead of: “What’s our refund policy?”
Use: “Based on this document [policy.pdf], what’s our refund policy?”
RAG grounds responses in actual documents, which reduces hallucinations about your specific data. It does not eliminate them: the model can still misread or ignore a passage it was given.
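As a rough sketch of the idea, the snippet below retrieves the most relevant passage and puts it in the prompt. Word overlap stands in for a real embedding search, and the passages, function names, and prompt wording are all made up for illustration.

```python
import re

def tokenize(text: str) -> set:
    """Lowercase word set; a crude stand-in for an embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, passages: list, k: int = 2) -> list:
    """Rank passages by word overlap with the question and keep the top k."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]

def build_grounded_prompt(question: str, passages: list) -> str:
    """Put retrieved text in the prompt so the model reads instead of recalls."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the numbered passages below.\n"
        f"{context}\n"
        f"Question: {question}"
    )

# Hypothetical policy snippets, invented for this example.
POLICY_PASSAGES = [
    "A refund is available within 30 days of purchase.",
    "Shipping takes 5 to 7 business days.",
    "Support is open Monday through Friday.",
]

question = "What is the refund policy?"
top = retrieve(question, POLICY_PASSAGES, k=1)
prompt = build_grounded_prompt(question, top)
print(prompt)
```

In a real system the retrieval step would usually be a vector or keyword index, but the shape stays the same: find the text first, then ask the question against it.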
Strategy 2: Ask for Citations
Make the AI cite its sources:
“Answer this question and cite the specific section of the document where you found the information.”
If it can’t cite a source, treat the answer as unverified. Check the citations it does give, too: a model can fabricate a citation as easily as a fact.
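A citation is only useful if you check it. Here is a minimal sketch, assuming you ask the model to quote the supporting text verbatim. The function names and the sample document are hypothetical.

```python
def normalize(text: str) -> str:
    """Collapse whitespace and case so formatting differences don't matter."""
    return " ".join(text.lower().split())

def citation_is_supported(quote: str, document: str) -> bool:
    """True only if the quoted text appears in the document after normalization."""
    quote_n = normalize(quote)
    # An empty quote counts as no citation at all.
    return bool(quote_n) and quote_n in normalize(document)

# Made-up document text for the example.
document = (
    "Section 4.2: Refunds are issued within 30 days.\n"
    "Section 4.3: Gift cards are final sale."
)

print(citation_is_supported("Refunds are issued within 30 days", document))
print(citation_is_supported("Refunds are issued within 90 days", document))
```

Exact matching is strict on purpose. A paraphrased citation will fail this check, which is usually the safer failure mode.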
Strategy 3: Constrain the Output
Give the AI explicit options:
“Based on this data, is the trend UP, DOWN, or FLAT? Choose only from these options.”
Constrained outputs are harder to hallucinate.
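Constraining the prompt is half of it; the other half is rejecting anything outside the allowed set when the reply comes back. A small sketch, with a hypothetical `parse_trend` helper:

```python
from typing import Optional

ALLOWED = {"UP", "DOWN", "FLAT"}

def parse_trend(raw_reply: str) -> Optional[str]:
    """Return the chosen option, or None if the reply is not one of the options."""
    cleaned = raw_reply.strip().strip(".!\"'").upper()
    return cleaned if cleaned in ALLOWED else None

print(parse_trend(" up. "))
print(parse_trend("The trend is mostly up"))
```

A `None` result is a signal to retry or escalate, not something to patch over by guessing which option the model meant.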
Strategy 4: Verification Loops
Have AI verify its own claims:
“You just stated X. Where in the provided documents is this supported? If you can’t find support, revise your answer.”
Self-verification catches many hallucinations.
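The loop can be sketched like this. `ask_model` is a placeholder for whatever client you use, the `SUPPORTED` reply convention is an assumption of this example, and `fake_model` is a stub so the snippet runs without a real model.

```python
from typing import Callable, Tuple

def verify_and_revise(
    answer: str,
    documents: str,
    ask_model: Callable[[str], str],
    max_rounds: int = 2,
) -> Tuple[str, bool]:
    """Return (final_answer, verified); verified is True only if the model confirmed support."""
    for _ in range(max_rounds):
        prompt = (
            f"You just stated: {answer}\n"
            "Where in the provided documents is this supported? "
            "If it is supported, reply with exactly SUPPORTED. "
            "If you can't find support, reply with a revised answer only.\n"
            f"Documents:\n{documents}"
        )
        reply = ask_model(prompt).strip()
        if reply == "SUPPORTED":
            return answer, True
        answer = reply  # take the revision and check it again next round
    return answer, False

def fake_model(prompt: str) -> str:
    """Stub standing in for a real model call."""
    if "You just stated: Refunds take 30 days." in prompt:
        return "SUPPORTED"
    return "Refunds take 30 days."

docs = "Policy: refunds are processed in 30 days."
print(verify_and_revise("Refunds take 90 days.", docs, fake_model))
```

The round limit matters: if the model never confirms support, the function returns the last revision marked as unverified instead of looping forever. Since the same model is grading its own work, treat a pass as a filter, not a guarantee.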
Strategy 5: Lower Temperature
For factual tasks, reduce randomness:
Use low temperature for factual queries, higher for creative tasks.
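To see why temperature matters, here is a small worked example of what it does to next-token probabilities. The logits are made up. How you set temperature in practice depends on your provider’s API, so this only shows the underlying math.

```python
import math

def softmax_with_temperature(logits: list, temperature: float) -> list:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.0]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all of the probability lands on the top-scoring token, so the output is more repeatable. At high temperature the distribution flattens and less likely tokens get sampled more often. Low temperature makes answers more consistent, but it does not make them correct if the top-scoring token is wrong.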
UI that encourages verification:
Workflows that include checks:
Data architecture that supports RAG:
Low tolerance (be careful):
For these: RAG, citations, human review, constrained outputs.
Medium tolerance:
For these: RAG, sampling review, source linking.
Higher tolerance:
For these: Creativity matters more than precision.
A well-calibrated AI should admit uncertainty:
“If you’re not confident about the answer based on the provided documents, say ‘I don’t have enough information to answer this confidently.’”
An AI that says “I don’t know” is more trustworthy than one that always answers confidently.
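If you give the model a fixed phrase to use when it is unsure, you can detect that phrase in code and route those replies differently, for example to a human. A minimal sketch with hypothetical helper names:

```python
ABSTAIN_PHRASE = "I don't have enough information to answer this confidently."

def _normalize(text: str) -> str:
    """Unify curly apostrophes, case, and whitespace before comparing."""
    return " ".join(text.replace("\u2019", "'").lower().split())

def add_abstain_instruction(question: str) -> str:
    """Append the instruction that gives the model a way to say it doesn't know."""
    return (
        f"{question}\n"
        "If you're not confident about the answer based on the provided "
        f"documents, say '{ABSTAIN_PHRASE}'"
    )

def is_abstention(reply: str) -> bool:
    """True if the reply contains the agreed abstention phrase."""
    return _normalize(ABSTAIN_PHRASE) in _normalize(reply)

print(is_abstention("I don\u2019t have enough information to answer this confidently."))
print(is_abstention("The refund window is 30 days."))
```

Counting how often this fires is also a useful signal: a model that never abstains on hard questions is probably guessing.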
Track hallucination rates in production:
User feedback: Did users flag incorrect information?
Verification checks: Did automated checks find unsupported claims?
Citation validity: Do cited sources actually support the claims?
Consistency: Do repeated queries give consistent answers?
Hallucination monitoring is ongoing, not one-time.
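One way to turn those four signals into numbers you can chart over time is sketched below. The `ResponseLog` fields are an assumed logging schema, and the sample records are invented.

```python
from dataclasses import dataclass

@dataclass
class ResponseLog:
    user_flagged: bool         # user feedback
    unsupported_claims: int    # found by automated verification checks
    citations_total: int
    citations_valid: int       # citations that actually support the claim
    consistent_on_retry: bool  # same answer when the query was repeated

def hallucination_metrics(logs: list) -> dict:
    """Aggregate per-response logs into one rate per monitoring signal."""
    n = len(logs)
    if n == 0:
        return {}
    cited = sum(log.citations_total for log in logs)
    valid = sum(log.citations_valid for log in logs)
    return {
        "user_flag_rate": sum(log.user_flagged for log in logs) / n,
        "unsupported_claim_rate": sum(log.unsupported_claims > 0 for log in logs) / n,
        "citation_validity": valid / cited if cited else None,
        "consistency_rate": sum(log.consistent_on_retry for log in logs) / n,
    }

# Invented sample data.
logs = [
    ResponseLog(False, 0, 2, 2, True),
    ResponseLog(True, 1, 2, 1, False),
    ResponseLog(False, 0, 1, 1, True),
    ResponseLog(False, 2, 3, 2, True),
]
metrics = hallucination_metrics(logs)
print(metrics)
```

Run this over a rolling window, not once, so that a prompt change or model upgrade that shifts the rates shows up quickly.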
When building AI systems:
Hallucinations happen. Handle them by design.
