Regulated-industry security ISV

Customer support agent for an identity-and-compliance ISV

  • Azure AI Foundry
  • Azure AI Search
  • Hybrid search
  • Microsoft Agent Framework
  • .NET
  • Bicep
  • Azure Functions

Challenge

An identity-and-compliance ISV serving regulated-industry customers had a long-tail support problem: a small population of premium-tier customers were filing deep, multi-day tickets on advanced cryptography and encryption workloads. The questions were always answerable from the product documentation; the documentation was just too dense and too scattered across PDFs, chapters, and code samples for a human triage agent to pattern-match quickly.

Approach

Built an agentic retrieval assistant grounded in the product's own technical documentation. PDFs were parsed into section-level markdown, chunked at subheading granularity, and indexed in Azure AI Search as a hybrid index: BM25 keyword results and 1536-dimension vector-embedding results, merged with reciprocal rank fusion. The agent runs on the Microsoft Agent Framework with Azure AI Foundry and uses two tools: the first ranks which books and chapters are relevant to a question; the second searches inside the chosen books and returns minimal chunks for citation. All service-to-service auth uses managed identity (zero secrets), and storage is reached over private endpoints. The agent streams responses via Server-Sent Events with tool-call transparency, so the support team can see which sources grounded each answer.
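To make "chunked at subheading granularity" concrete, here is a minimal sketch of a heading-based chunker. It is illustrative only: the case study lists .NET in its stack, and Python is used here purely for brevity. The Chunk fields, the chunk_markdown name, and the choice to treat level-1 headings as chapters are assumptions, not details from the engagement.

```python
import re
from dataclasses import dataclass

# Matches markdown ATX headings such as "# Title" or "### Title".
# Assumption: the parsed markdown has no fenced code whose lines start with "#".
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    book: str
    chapter: str   # nearest level-1 heading (hypothetical convention)
    heading: str   # nearest heading of any level
    content: str


def chunk_markdown(book: str, markdown: str) -> list[Chunk]:
    """Emit one chunk per heading; sections with no body text are skipped."""
    chunks: list[Chunk] = []
    chapter = ""
    heading = ""
    body: list[str] = []

    def flush() -> None:
        # Close the section collected so far, if it has any text.
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk(book, chapter, heading, text))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level, title = len(match.group(1)), match.group(2)
            if level == 1:
                chapter = title
            heading = title
        else:
            body.append(line)
    flush()
    return chunks
```

Keeping the chapter and heading on every chunk is what lets a later retrieval step cite the source without re-reading the document.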

Result

The reference architecture and the code that runs the agent were adopted by the ISV's customer support team and lifted into their production support portal for the premium-tier customer base. The agent now answers in minutes the kind of question that previously turned into a multi-day documentation deep-dive for a senior support engineer.

Adopted into production by the ISV's customer support organization; serving the premium-tier customer base for deep-cryptography questions

Why this matters

Premium-tier support for a regulated-industry product is one of the hardest places to put an LLM. The wrong answer in this domain is not just embarrassing; it can ship a non-compliant cryptographic configuration into production at a bank or a defense contractor. Preventing hallucination is non-negotiable: the agent has to be unable to invent an API, an algorithm name, or a parameter that does not exist in the documentation.

That constraint shaped every architectural choice.

Grounding before generation

The agent is forbidden from answering from model knowledge alone. Every conversation turn calls FindRelevantSources first, which runs a hybrid (keyword plus vector) search across the entire documentation corpus and returns the top books and chapters by relevance. Only then does the agent call SearchByBook, which returns minimal content chunks (content, chapter, heading) filtered to the chosen scope. The final response is composed from those chunks with explicit citation back to the source book and chapter.

Why two tools, not one? A single “search everything” tool produced answers that confidently mixed material from the wrong book. Forcing the agent to pick a book first dropped that failure mode to near zero, at the cost of one extra tool call per turn. Worth it.
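The two-step flow described above can be sketched in a few lines. This is a toy, in-memory stand-in rather than the engagement's code: the corpus, the term-counting score, and the grounded_context helper are all hypothetical, and the real tools query Azure AI Search. Only the tool names (rendered here in Python style) and the returned fields (content, chapter, heading) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    book: str
    chapter: str
    heading: str
    content: str


# Hypothetical corpus standing in for the search index.
CORPUS = [
    Chunk("Encryption Guide", "Symmetric Ciphers", "AES-GCM nonces",
          "Never reuse a nonce with the same AES-GCM key."),
    Chunk("Encryption Guide", "Asymmetric Ciphers", "RSA-OAEP-256",
          "RSA-OAEP-256 uses SHA-256 for the mask generation function."),
    Chunk("Admin Guide", "Backups", "Key escrow",
          "Escrowed keys are wrapped before backup."),
]


def _score(query: str, chunk: Chunk) -> int:
    """Toy relevance: number of query terms found in the chunk (not hybrid search)."""
    terms = set(query.lower().split())
    text = f"{chunk.heading} {chunk.content}".lower()
    return sum(1 for term in terms if term in text)


def find_relevant_sources(query: str, top: int = 2) -> list[tuple[str, str]]:
    """Tool 1: rank (book, chapter) pairs by their best-scoring chunk."""
    best: dict[tuple[str, str], int] = {}
    for chunk in CORPUS:
        score = _score(query, chunk)
        if score > 0:
            key = (chunk.book, chunk.chapter)
            best[key] = max(best.get(key, 0), score)
    return sorted(best, key=lambda key: -best[key])[:top]


def search_by_book(query: str, book: str, top: int = 3) -> list[dict]:
    """Tool 2: search inside one book only; return minimal fields for citation."""
    hits = [c for c in CORPUS if c.book == book and _score(query, c) > 0]
    hits.sort(key=lambda c: -_score(query, c))
    return [{"content": c.content, "chapter": c.chapter, "heading": c.heading}
            for c in hits[:top]]


def grounded_context(query: str):
    """Enforce the order: choose a source first, then retrieve within it."""
    sources = find_relevant_sources(query)
    if not sources:
        return None  # nothing to ground on, so the agent should decline to answer
    book, _chapter = sources[0]
    return {"book": book, "chunks": search_by_book(query, book)}
```

Because search_by_book filters on the chosen book, material from another book cannot leak into the context, which is the failure mode the two-tool split was meant to remove.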

Hybrid retrieval, not just vector

Cryptography documentation is full of acronyms, parameter names, and algorithm identifiers (“AES-GCM”, “X9.42”, “RSA-OAEP-256”) that vector search alone tends to under-weight in favor of semantically related but lexically different chunks. Hybrid retrieval (BM25 plus 1536-dim embeddings, fused with reciprocal rank fusion) was the difference between “this answer cites the right algorithm” and “this answer cites a vaguely similar algorithm.” Vector-only retrieval consistently lost on questions where the user already knew the exact term they were looking for, which describes most premium-tier support questions.
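For readers unfamiliar with reciprocal rank fusion, the sketch below shows the general technique: each result list contributes 1 / (k + rank) to a document's score, and the sums decide the fused order. It illustrates the algorithm, not the search service's internals; the function name, the document IDs, and k = 60 (a commonly used constant) are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each list adds 1 / (k + rank) to a document's score, with rank starting at 1.
    Ties are broken alphabetically so the output is deterministic.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists for a question that names "RSA-OAEP-256" exactly.
keyword_hits = ["rsa-oaep-256", "rsa-oaep-384", "key-wrap"]  # BM25 order
vector_hits = ["rsa-pss", "rsa-oaep-256", "rsa-pkcs1"]       # embedding order

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

In this example the chunk that appears in both lists ranks first, ahead of the chunk the vector ranking alone would have put on top.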

Production-grade plumbing

The bits that don’t show up in a demo but matter for handing the code to a customer support team:

  • Zero secrets. Service-to-service auth is Azure managed identity with RBAC. No connection strings, no keys to rotate, no Key Vault to misconfigure.
  • Private networking. The Azure Storage account that holds the function’s runtime state has publicNetworkAccess: Disabled. The function communicates with storage exclusively through private endpoints over a VNet. The search and AI Foundry endpoints remain public (they’re the API contract) but everything else is off the internet.
  • Streaming with tool transparency. Responses stream as Server-Sent Events with explicit tool_start and tool_end events. The support agent UI shows which book and chapter grounded each answer, so a human can verify the citation in one click before passing the response to a customer.
  • Singleton agent in-process. The agent runs inside the Azure Function process as a singleton, created on first request and reused across sessions. Cold-start cost is paid once; subsequent requests skip framework initialization. A future iteration moves the agent to Microsoft’s hosted Foundry Agent Service for per-conversation identity and portal visibility.
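As an illustration of the streaming bullet above, this sketch shows how a turn might be serialized as Server-Sent Events. Only the tool_start and tool_end event names come from the description; the token and done events, the payload fields, and the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def sse_frame(event: str, data: dict) -> str:
    """Serialize one SSE frame: an event line, a JSON data line, and a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_turn(tool_calls: Iterable[dict], answer_tokens: Iterable[str]) -> Iterator[str]:
    """Yield frames for one turn: tool activity first, then the answer text."""
    for call in tool_calls:
        # Announce the tool call so the UI can show what is being searched.
        yield sse_frame("tool_start", {"tool": call["tool"], "args": call["args"]})
        # Report which sources the tool returned, for one-click verification.
        yield sse_frame("tool_end", {"tool": call["tool"], "sources": call["sources"]})
    for token in answer_tokens:
        yield sse_frame("token", {"text": token})  # hypothetical event name
    yield sse_frame("done", {})                    # hypothetical event name
```

Sending tool events as named frames, separate from answer text, lets the UI render citations without parsing the generated prose.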

The handoff

The most useful thing about this engagement was not the demo. It was the fact that the reference architecture, the Bicep, the agent code, the search indexer, and the operational playbook were designed from day one to be picked up and run by someone other than me. When the customer support team adopted the code into their production portal, the lift was small enough to be a straightforward platform-team integration, not a months-long re-architecture.

That handoff was the goal. The demo was just the proof.
