BROCS: The OS for Enterprise AI

Every enterprise we work with at Calliope has the same five things they need to do with their AI apps and agents. Build the apps and agents. Run them somewhere safe. Observe what they do. Control what they are allowed to do. Secure the whole stack to a standard auditors will accept.

Five stages in the lifecycle of every enterprise AI app or agent, in roughly the order they happen. BROCS™.

Our founder Leo Mata writes about the same framework in a personal voice on his blog, with the conversations and concrete failure modes that shaped it. This is the company version: same architecture, more product detail.

This is the framework we use internally to organize what Calliope is, and externally to talk to customers about where their current tooling stops and where the gaps are. It is also the structural argument for why an enterprise-AI stack has to be composable, deployable inside the perimeter, and built like an operating system rather than a constellation of point tools.

The five stages

Build

Somewhere the technical team makes things with AI. IDEs, notebooks, app builders, agent builders, document automation, data tools, deep research, chat, terminal. Five years ago this was one product (the IDE) for one audience (developers). Today it is a portfolio for the whole technical organization, and increasingly for the rest of the organization too.

The pattern we see most often when Build is missing in a sanctioned form is the predictable one. The team needs to ship. IT is somewhere in the middle of a Copilot pilot. The work cannot wait. So people sign up for personal accounts, prototype quickly, ship useful things, and create the kind of shadow exposure the security team will eventually spend a quarter cleaning up. An organization without a Build layer does not slow down. It routes around itself.

Run

The apps and agents have to execute somewhere. Agentic AI complicates this beyond a normal application runtime. Agents hold state, call into internal systems, run multi-step workflows, fail in ways that look nothing like a 500. Whatever they execute on has to absorb all of that inside the customer’s perimeter, against the customer’s data.

Most of the current AI tooling market has skipped this stage on the assumption that the customer’s existing infrastructure will absorb whatever workload the AI tools produce. That works for a demo. Production breaks it. The first time a real agent has to write back into a real system inside an enterprise, the runtime question, the identity question, the network isolation question, and the failure-mode question all show up at once, and the vendor who only shipped the IDE has nothing useful to say.

Observe

Once an app or agent is running, you need to see what it is doing. Logs and metrics from the runtime are table stakes. On top of that, the agent emits its own event stream of calls made, responses received, and decisions reached, and the audit chain over all of it has to hold up against tampering.

This is the stage that gets skipped most often, and the one that bites first when something goes wrong. An enterprise that cannot explain in writing what its AI did last Tuesday will be blocked from deploying it. The auditor blocks, then the CISO blocks, then the customer blocks. The audit chain is the price of admission to enterprise AI from here on out.
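One common way to make an audit trail tamper-evident is to hash-chain the records, so that each entry commits to the one before it. The sketch below is a minimal illustration of that idea, not a description of how Calliope implements it. The names `append_event`, `verify_chain`, and `GENESIS` are hypothetical, and the event fields are made up for the example.

```python
import hashlib
import json

# Hypothetical starting value for the chain; any fixed, agreed constant works.
GENESIS = "0" * 64


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous record's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(chain: list, event: dict) -> dict:
    """Append an agent event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, event),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every link; an edited, removed, or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_event(log, {"agent": "billing-bot", "action": "llm_call", "model": "example-model"})
    append_event(log, {"agent": "billing-bot", "action": "write", "target": "crm"})
    print(verify_chain(log))            # chain is intact
    log[0]["event"]["target"] = "erp"   # simulate tampering with an old record
    print(verify_chain(log))            # the altered record no longer matches its hash
```

A chain like this only detects tampering if the most recent hash is also stored somewhere the writer cannot silently rewrite, such as a separate system or a periodic signed checkpoint. That anchoring step is assumed here and not shown.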

Control

Now that you can see, you have to act. Policy enforcement, role-based and attribute-based access, compliance frameworks, incident handling. Control is the closed loop where the platform acts on what Observe surfaces. It sits in the request path between the agent and the LLM, between the user and the data, between the policy on paper and the outcome that lands in production. Anything that lives outside the request path is observability with a fancier name.
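To make "in the request path" concrete, here is a minimal sketch of a policy gate that every agent action must pass through before it reaches the tool or model. It is an illustration under stated assumptions, not Zentinelle's API. The roles, tool names, refund limit, and the functions `evaluate` and `guarded_call` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Tuple

# Hypothetical role-to-tool allowlist (the role-based part of the check).
ROLE_TOOLS = {
    "support_agent": {"lookup_order", "issue_refund"},
    "analyst": {"run_query"},
}

# Hypothetical attribute rule: refunds above this amount are refused.
MAX_REFUND = 100


class PolicyDenied(Exception):
    """Raised when a request is blocked before it reaches the handler."""


@dataclass(frozen=True)
class Request:
    role: str
    tool: str
    attributes: dict = field(default_factory=dict)


def evaluate(request: Request) -> Tuple[bool, str]:
    """Return (allowed, reason) for a request; unknown roles get no tools."""
    if request.tool not in ROLE_TOOLS.get(request.role, set()):
        return False, "tool not permitted for role"
    if request.tool == "issue_refund" and request.attributes.get("amount", 0) > MAX_REFUND:
        return False, "refund exceeds limit"
    return True, "ok"


def guarded_call(request: Request, handler: Callable[[Request], object]) -> object:
    """Run the handler only if policy allows it; otherwise the call never happens."""
    allowed, reason = evaluate(request)
    if not allowed:
        raise PolicyDenied(reason)
    return handler(request)
```

The design point is that `guarded_call` is the only route to the handler. A check that runs after the fact on logs can report a violation, but it cannot stop the call, which is the distinction the paragraph above draws between Control and observability.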

Secure

The structural posture around the apps and agents you have put into production. SOC 2, GDPR, HIPAA, EU AI Act, NIST AI RMF. The compliance pack the CISO hands the auditor. The risk register that survives counsel review. The incident response process the board can attest to.

Not every organization carries the same compliance weight, and not every conversation needs the S. For a healthy chunk of customers, BROC is enough. For a CISO conversation, or any sale into a regulated industry, the S has to be present from the beginning.

Why these five and no other set

The five stages map to how the customer thinks about the work, not to how AI products are built and sold internally. Nobody asks themselves whether they have a training pipeline. They ask whether, if they let this thing run inside their company, they will be able to see what it is doing, control it, and prove it to an auditor. The framework that wins adoption is the one that matches the operator’s mental model.

Kubernetes versus OpenStack is the canonical version of this. Kubernetes mapped to how operators think about their work (containers, services, deployments). OpenStack mapped to how vendors organized their product lines (compute, storage, networking). The framework that earned the operator’s language eventually ate the one that did not. Most AI tooling vendors today are in OpenStack mode, pitching their product surface instead of the operator’s workflow.

What happens with a stage missing

Two recent public incidents at McDonald’s make two of the failure modes concrete.

In mid-2025, security researchers Ian Carroll and Sam Curry reported that McHire, McDonald’s recruitment platform built on Paradox AI’s “Olivia” chatbot, had an admin login protected by the default password “123456” and an insecure direct object reference in an internal API. Personally identifiable information and chat histories for roughly 64 million job applicants were accessible. Paradox patched it within a day of disclosure. That is a Secure failure. A password of “123456” on the admin path of a recruitment chatbot is not a finding from a deep audit. It is the first thing any structured SOC 2 readiness review would have caught.

Separately, McDonald’s customer-support chatbot was prompted with a technical request and obliged, producing output entirely outside its intended scope. The bot did not go rogue. It did exactly what it was structurally capable of doing. That is a Control failure: guardrails added after deployment cannot constrain an agent that was never scoped at architecture time.

The other failure modes follow the same pattern. Without a Build layer, the technical organization at large never gets moving on AI apps and agents; everything stays a side project for a few senior people. Without Run, the pilots are stuck in notebooks and die on the way to production. Without Observe, nothing earns the trust to deploy in the first place.

The cost of multi-vendor delegation

There is a second cost beyond integration time. When the customer delegates Build, Run, or Secure to a third party, that vendor’s bad days and bad decisions become the customer’s. Three recent examples, all in the last 60 days.

Vercel’s April 2026 security incident started at a third-party AI tool used by one of their employees, propagated through Google Workspace into their internal systems, and ended with API keys and tokens exposed for a subset of Vercel customers. Lovable.dev published their own incident response for the downstream effect on their platform, because Lovable customers were affected through infrastructure they did not directly select. A month later, Railway suffered an eight-hour platform-wide outage after Google Cloud’s automation incorrectly suspended their production account. Railway took full responsibility for the architectural dependency. The trigger was a decision outside their control.

None of these vendors is incompetent. The pattern gets worse, not better, as more of the AI tooling stack depends on more upstream services. When the layer that matters lives outside a perimeter the customer actually controls, every blast radius upstream gets absorbed downstream.

The architectural shape

The five stages are independent and composable. Each one is useful on its own. The value compounds when they tile cleanly across the lifecycle.

The architecture that wins ships the full lifecycle as a composable system, designed to deploy inside the customer’s perimeter, with each stage useful standalone and each one stronger when used with the others. The future of enterprise AI is hybrid. Apps and agents will run across cloud accounts, on-prem datacenters, and edge environments inside the same organization, and the platform has to absorb that natively rather than ship a separate product per environment.

Structurally, that makes the platform an operating system. A foundation that apps, agents, and governance all run on. The OS underneath the AI surface.

Short version: own your data, own your cloud, simplify how you operate inside both. Anything else triples the overhead, leaves you managing five different tools, and inherits every upstream incident along the way.

Where Calliope fits

Calliope is built around BROCS. Three products tile the lifecycle:

  • Workbench is the Build stage. A browser-based work surface for the whole technical organization (developers, analysts, business users) with the tool stack that lets them ship apps and agents inside the customer’s perimeter.
  • Astrolift is the Run stage and half of Observe. Multi-cloud and on-prem Kubernetes runtime. Same manifest against AWS, GCP, Azure, vanilla Kubernetes, or bare metal. MIT-licensed.
  • Zentinelle is Control, the other half of Observe, and the Secure structural posture. Agent GRC, MIT-licensed. Twenty-four policy evaluators, twenty-four backend integrations, five compliance frameworks shipping (SOC 2, GDPR, HIPAA, EU AI Act, NIST AI RMF).

Each product is useful on its own. Together they form the full BROCS surface inside the customer’s perimeter. The customer owns their data, owns their cloud, and operates inside both with a single deployable platform instead of three to five vendors with three to five contracts.

If you are building, deploying, or buying anything in enterprise AI right now, walk through the five stages in your design and check that each one is present and that they compose. If a stage is missing, you will know which one within six months of deployment.

Talk to us at calliope.ai.


BROCS is a framework introduced by Leo Mata, founder of Calliope Labs, May 2026. Released under CC BY 4.0. Free to use, cite, build on, and adapt with attribution. BROCS™ is claimed as an unregistered trademark by Leo Mata.
