The Four Dimensions of Tenant Isolation in AI Systems
Data isolation
Each tenant's source documents, extracted data, and stored outputs must be isolated at the storage layer. Vector databases must be configured so that retrieval for Tenant A never returns chunks from Tenant B's documents. This requires tenant-scoped namespaces or collections, not just metadata filtering on a shared collection — metadata filtering can be bypassed; namespace isolation cannot.
Context isolation
LLM context windows must never carry one tenant's data into another tenant's request. This means: no shared conversation state across tenants, per-tenant system prompts (not shared prompts that reference multiple tenants), and session isolation at the application layer that prevents accidental context reuse.
Cache isolation
Semantic caches must be scoped per tenant. A cached response for Tenant A's question about their refund policy should never be served to Tenant B, even if the question is semantically identical. Cache keys must include a tenant identifier, and cache stores should be partitioned by tenant for compliance-sensitive use cases.
Cost isolation
Cost attribution per tenant enables: per-tenant billing, identification of high-cost tenants who need optimisation or pricing adjustment, and fair rate limiting that prevents one tenant's usage spike from degrading other tenants' performance.