The production half of agents finally shipped
Toolboxes and Memory in Microsoft Foundry are the production-side pieces that decide whether your agent makes it past a demo.
- #foundry
- #agents
- #devex
- #governance
Most of the noise out of Microsoft Build 2026 went to the big-name pieces: hosted agents, the optimizer, Foundry IQ. Fair enough, they are the parts that demo well. The pieces I think matter most for anyone shipping agents into production are two quieter ones. Both are in the Foundry Build 2026 recap, tucked into the “if you only have time to try one thing” table.
The first is Toolboxes in Foundry, now in public preview. The pitch is small and important: one governed endpoint that fronts your tools, skills, MCP clients, and enterprise data. Every agent project I have watched, mine included, eventually grows a sprawl of ad-hoc tool registrations, function wrappers, MCP servers spun up by different people, and “we’ll secure that next sprint” caveats. Toolboxes are an attempt to make the tool surface a first-class governed object instead of an emergent property of however many notebooks were open last quarter. That alone moves a lot of agent code from “demo” to “auditable.”
The second is Memory in Foundry Agent Service, also in public preview, with procedural, user, and session memory plus features like time-to-live. The interesting word in the post is “reliability.” Most teams have treated memory as a personalization feature: remember my preferences, remember the project, remember the last thing we did. That matters, but procedural memory is more interesting for enterprise work because the failure mode is not always a missing fact. It is skipping a validation step, using the wrong tool path, forgetting a policy check, or repeating the same broken pattern on a similar task.
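To make the three memory kinds and time-to-live concrete, here is a minimal sketch in plain Python. It is not the Foundry Agent Service API: `MemoryKind`, `MemoryRecord`, and `live_records` are hypothetical names I made up for illustration, and the only thing it shows is the shape of the decision, namely that each record carries a kind and an optional expiry.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import List, Optional


class MemoryKind(Enum):
    # Hypothetical labels mirroring the three kinds named in the post.
    PROCEDURAL = "procedural"  # how a task should be carried out
    USER = "user"              # preferences and facts about a person
    SESSION = "session"        # context from the current conversation


@dataclass(frozen=True)
class MemoryRecord:
    kind: MemoryKind
    content: str
    created_at: datetime
    ttl: Optional[timedelta]  # None means "keep until explicitly deleted"

    def is_expired(self, now: datetime) -> bool:
        # A record with no TTL never expires on its own.
        return self.ttl is not None and now >= self.created_at + self.ttl


def live_records(records: List[MemoryRecord], now: datetime) -> List[MemoryRecord]:
    """Return only the records the agent is still allowed to recall."""
    return [r for r in records if not r.is_expired(now)]


# Illustrative data only.
now = datetime(2026, 6, 1, tzinfo=timezone.utc)
records = [
    MemoryRecord(MemoryKind.SESSION, "user asked about invoice 17",
                 now - timedelta(hours=2), timedelta(hours=1)),
    MemoryRecord(MemoryKind.PROCEDURAL, "run the policy check before export",
                 now - timedelta(days=400), None),
    MemoryRecord(MemoryKind.USER, "prefers weekly summaries",
                 now - timedelta(days=1), timedelta(days=30)),
]

for record in live_records(records, now):
    print(record.kind.value, "|", record.content)
```

In this sketch the two-hour-old session note has outlived its one-hour TTL and drops out, while the procedural rule stays because it has no expiry. The point is that "what may the agent remember, and for how long" becomes an explicit field instead of an accident.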
For SI and ISV partners, this is the part that should get attention. Customers do not usually ask for a toolbox strategy or a memory lifecycle design by name. They ask why the agent worked in the pilot and got weird in production. They ask who can approve tools, how data access is governed, why the agent remembered something it should have forgotten, and whether the workflow can improve without becoming a black box.
That is the production half of agents: governed tools and reliable memory. Not glamorous, not optional.
If you are building partner solutions on Foundry, I would start here before the next demo polish pass. Inventory the tools your agents can call. Decide what belongs behind a governed toolbox. Then decide what your agents are allowed to remember, for how long, and why. The teams that answer those questions early will have a much easier time turning agent demos into customer deliverables.
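The inventory step does not need any platform feature to get started. Below is a rough sketch of what I mean, in plain Python. Everything in it is an assumption for illustration: `ToolEntry`, `needs_review`, the field names, and the sample tools are hypothetical, and nothing here calls Foundry or Toolboxes.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ToolEntry:
    # Hypothetical inventory row; adapt the fields to your own review process.
    name: str
    owner: str                   # team accountable for the tool
    touches_customer_data: bool
    governed: bool               # True once the tool sits behind a governed endpoint


def needs_review(inventory: Iterable[ToolEntry]) -> List[ToolEntry]:
    """Return tools that touch customer data but are not yet governed, sorted by name."""
    return sorted(
        (t for t in inventory if t.touches_customer_data and not t.governed),
        key=lambda t: t.name,
    )


# Illustrative inventory only.
inventory = [
    ToolEntry("crm_lookup", "sales-eng", touches_customer_data=True, governed=False),
    ToolEntry("unit_converter", "platform", touches_customer_data=False, governed=False),
    ToolEntry("ticket_search", "support", touches_customer_data=True, governed=True),
    ToolEntry("billing_export", "finance", touches_customer_data=True, governed=False),
]

for tool in needs_review(inventory):
    print(f"{tool.name} (owner: {tool.owner})")
```

With this sample data the list that comes back is `billing_export` and `crm_lookup`: the tools that reach customer data without governance in front of them. That short list is a reasonable first candidate set for whatever ends up behind a governed toolbox.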