Content Quality Principles for AI Knowledge Bases
- Atomicity: each document or chunk should cover one topic completely. Avoid documents that mix unrelated topics: retrieval systems return whole chunks, so a mixed chunk brings irrelevant material into the response along with the relevant part
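- Consistency: use consistent terminology throughout. If the same concept goes by three different names across documents, a keyword-based query using any one name can miss the content filed under the other two, which is roughly two-thirds of it if usage is evenly split; embedding-based retrieval narrows this gap but does not reliably close it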
- Completeness: document what happens in edge cases and exceptions, not just the happy path — AI retrieves what exists; gaps in coverage produce gaps in AI responses
- Currency: outdated information is worse than no information. Retrieval does not by itself distinguish stale content from current content, so a knowledge base that is 30% out of date produces confidently wrong outputs on the affected topics
- Specificity: write for the specific questions users ask, not for the generic topics that seem important — "how do I request a refund for a subscription?" is more useful than "Refund Policy Overview"
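
Two of the principles above, consistency and currency, lend themselves to automated checks. The following is a minimal sketch of such an audit, not a definitive implementation. The `Doc` structure, the `TERM_VARIANTS` mapping, and the 180-day review window are all hypothetical and would need to be adapted to the knowledge base in question. Variant detection uses plain substring matching, which is a deliberate simplification.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical glossary: preferred term -> variants that should be replaced by it.
TERM_VARIANTS = {
    "refund": ["reimbursement", "money back"],
}


@dataclass
class Doc:
    title: str
    body: str
    last_reviewed: date  # assumed to be tracked per document


def find_variant_terms(doc, variants=TERM_VARIANTS):
    """Return (variant, preferred) pairs for non-preferred terms found in the body."""
    text = doc.body.lower()
    return [
        (variant, preferred)
        for preferred, variant_list in variants.items()
        for variant in variant_list
        if variant in text  # simple substring match, not word-boundary aware
    ]


def is_stale(doc, today, max_age_days=180):
    """True if the document was last reviewed more than max_age_days ago."""
    return (today - doc.last_reviewed).days > max_age_days


def audit(docs, today):
    """Map each problematic document title to a list of issue descriptions."""
    report = {}
    for doc in docs:
        issues = [
            f"uses '{variant}'; prefer '{preferred}'"
            for variant, preferred in find_variant_terms(doc)
        ]
        if is_stale(doc, today):
            issues.append("review overdue")
        if issues:
            report[doc.title] = issues
    return report
```

Calling `audit(docs, today=date.today())` on a list of `Doc` objects returns only the documents that need attention, so the report can serve as a review queue. Atomicity, completeness, and specificity are harder to check mechanically and are left to editorial review in this sketch.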