Caching Strategy (Planned)
Layers (proposed):
- Client-side response memoization (short-lived)
- In-process LRU (queries → doc ids)
- Vector index (semantic retrieval)
- Optional Redis for cross-process embedding & generation cache
- CDN edge (static assets, future)
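
As an illustration of the in-process LRU layer (queries → doc ids), here is a minimal sketch. The class name `QueryLRU`, its methods, and the default capacity are hypothetical; the plan above does not fix an interface, so treat this as one possible shape rather than the intended implementation.

```python
from collections import OrderedDict
from typing import Iterable, List, Optional


class QueryLRU:
    """Hypothetical in-process LRU mapping a query string to doc ids."""

    def __init__(self, capacity: int = 1024) -> None:
        # The default capacity is an arbitrary placeholder, not a tuned value.
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[str, List[str]]" = OrderedDict()

    def get(self, query: str) -> Optional[List[str]]:
        """Return cached doc ids, or None on a miss."""
        doc_ids = self._items.get(query)
        if doc_ids is not None:
            # A hit makes this entry the most recently used one.
            self._items.move_to_end(query)
        return doc_ids

    def put(self, query: str, doc_ids: Iterable[str]) -> None:
        """Store doc ids for a query, evicting the least recently used entry if full."""
        self._items[query] = list(doc_ids)
        self._items.move_to_end(query)
        if len(self._items) > self.capacity:
            # popitem(last=False) removes the oldest (least recently used) entry.
            self._items.popitem(last=False)
```

A caller would check `get(query)` first and fall through to the vector index on `None`, then `put` the result. This sketch is not thread-safe; a lock would be needed if several threads share one instance.
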
Metrics: hit ratio per layer; stale serve count; invalidation latency.
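
The three metrics could be tracked with one small counter object per layer. This is a sketch only: `LayerMetrics` and its field names are assumptions, and a real deployment would more likely export these values to whatever metrics system the project adopts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LayerMetrics:
    """Hypothetical per-layer counters for the metrics listed above."""

    hits: int = 0
    misses: int = 0
    stale_serves: int = 0
    invalidation_latencies_s: List[float] = field(default_factory=list)

    def record_lookup(self, hit: bool, stale: bool = False) -> None:
        # A stale serve is counted as a hit that returned outdated data.
        if hit:
            self.hits += 1
            if stale:
                self.stale_serves += 1
        else:
            self.misses += 1

    def record_invalidation(self, latency_s: float) -> None:
        # Latency here means seconds from invalidation request to completion.
        self.invalidation_latencies_s.append(latency_s)

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        # Report 0.0 before any lookups to avoid dividing by zero.
        return self.hits / total if total else 0.0
```

Keeping one instance per layer (for example in a dict keyed by layer name) gives the per-layer hit ratio directly.
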
Open Questions:
- Cache key normalization for multilingual content?
- Invalidation procedure when the embedding model changes?
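
One option worth evaluating for both questions, not a decision: normalize the query text with Unicode NFKC plus case folding, and include the embedding model version in the key so that a model change produces new keys and old entries simply stop being read. The function name `cache_key`, the `emb:` prefix, and the `lang` tag are all hypothetical.

```python
import hashlib
import unicodedata


def cache_key(query: str, model_version: str, lang: str = "und") -> str:
    """Build a hypothetical cache key from a query and an embedding model version."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) into canonical ones;
    # casefold() is a more aggressive, language-neutral lowercase.
    normalized = unicodedata.normalize("NFKC", query).casefold()
    # Collapse runs of whitespace and trim both ends.
    normalized = " ".join(normalized.split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    # "und" is the BCP 47 tag for an undetermined language.
    return f"emb:{model_version}:{lang}:{digest}"
```

Versioned keys avoid an explicit purge step but leave orphaned entries until they expire or are evicted, so a TTL or size bound would still be needed. Whether case folding is appropriate for every supported language remains part of the open question.
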