Performance & Latency
Gateco adds a policy evaluation layer between your AI agent and your vector database. This overhead is deterministic and bounded.
p95 Latency
Approximately 25 ms of overhead on top of the raw vector DB call, measured against a pgvector baseline with a 5-rule RBAC policy and 100 resources. Policy evaluation time is dominated by principal attribute lookup and AST traversal; both are O(rules) and cached across requests.
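The O(rules) evaluation cost can be pictured as a linear scan over compiled rules. This is an illustrative sketch, not Gateco's internal API; the `Rule` shape and role-based matching are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    # Hypothetical compiled rule: matches when the principal holds any listed role.
    effect: str                                # "allow" or "deny"
    roles: set[str] = field(default_factory=set)

def evaluate(rules: list[Rule], principal_roles: set[str]) -> bool:
    # Linear scan over rules: cost grows with len(rules), hence O(rules).
    for rule in rules:
        if rule.roles & principal_roles:
            return rule.effect == "allow"
    return False  # default-deny when no rule matches

rules = [Rule("deny", {"suspended"}), Rule("allow", {"analyst"})]
print(evaluate(rules, {"analyst"}))  # True
print(evaluate(rules, {"guest"}))    # False
```

With a handful of rules this scan is sub-microsecond, which is why attribute lookup, not traversal, tends to dominate the 25 ms figure.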
Caching Layers
Gateco applies three caching layers to reduce policy evaluation overhead on repeated requests.
Per-org policy compilation cache
60 s TTL. The compiled policy AST is stored in-process. Policy updates (activate/archive) invalidate the cache immediately.
Principal attribute cache
5 min TTL. Principal role and group data is cached after first lookup. The cache is refreshed automatically on every IDP sync.
Per-query negative result cache
30 s TTL. Denied queries are cached by principal + resource set fingerprint to avoid redundant policy evaluation on rapid repeated attempts.
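All three layers follow the same TTL-cache pattern. A minimal sketch of that pattern, with the TTLs from above; names and internals are illustrative, not Gateco's implementation:

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[1] < time.monotonic():
            return None  # missing or expired
        return entry[0]

    def put(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def invalidate(self, key):
        # Explicit eviction, e.g. on policy activate/archive or IDP sync.
        self._store.pop(key, None)

# TTLs matching the three layers described above.
policy_cache = TTLCache(60)      # per-org compiled policy AST
principal_cache = TTLCache(300)  # principal role and group data
negative_cache = TTLCache(30)    # denied principal + resource-set fingerprints

policy_cache.put("org-123", "<compiled AST>")
print(policy_cache.get("org-123"))  # <compiled AST>
policy_cache.invalidate("org-123")  # e.g. the policy was archived
print(policy_cache.get("org-123"))  # None
```

The negative cache would key entries by a fingerprint such as a hash of the principal ID plus the sorted resource IDs, so rapid repeat attempts hit the cache rather than re-evaluating the policy.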
Throughput
A single backend pod sustains approximately 200 RPS. Scale horizontally — all state is in Postgres and workers are stateless. The auto-sync scheduler uses PostgreSQL advisory locks to coordinate across instances without double-execution.
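The advisory-lock pattern needs a stable 64-bit key that every pod derives identically for a given job. A sketch of one way to derive it (the job name and SQL usage are illustrative; Gateco's actual key scheme is not documented here):

```python
import hashlib

def advisory_lock_key(job_name: str) -> int:
    # Derive a stable signed 64-bit key for pg_try_advisory_lock from a job name.
    digest = hashlib.sha256(job_name.encode()).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

# Each worker would then run, e.g. via its Postgres driver:
#   SELECT pg_try_advisory_lock(%s);  -- returns true for exactly one session
# and skip the sync run when it gets false, so each scheduled job fires once
# per interval even with many stateless pods.
key = advisory_lock_key("auto-sync")
print(key == advisory_lock_key("auto-sync"))  # True: same key on every pod
```

Because `pg_try_advisory_lock` is non-blocking, losing pods move on immediately instead of queueing behind the winner.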
Connector Latency
Approximate p95 latency for the vector query leg only (excludes policy evaluation overhead). Values are representative of cloud or locally-hosted deployments in the same region as the Gateco backend.
| Connector | Approximate p95 latency |
|---|---|
| pgvector (local) | 2–5 ms |
| Supabase | 15–30 ms |
| Neon | 15–35 ms |
| Qdrant (cloud) | 20–50 ms |
| Pinecone | 25–60 ms |
| Weaviate | 20–50 ms |
| OpenSearch | 30–70 ms |
| Milvus | 25–60 ms |
| Chroma | 10–30 ms |
| Azure AI Search | 30–80 ms |
| Vertex AI Vector Search | 30–80 ms |
| Vertex AI Search | 40–100 ms |
Getting Better Performance
1. Use sidecar metadata resolution (default). Sidecar mode reads metadata from Gateco's own registry: one indexed Postgres lookup per retrieval. Inline and sql_view modes add an extra network round-trip to the connector.
2. Keep policy rule count under 20 per policy. Policy AST compilation cost is O(rules). Below 20 rules, compilation is sub-millisecond and effectively free after a warm cache. Above 50 rules, consider splitting into multiple narrower policies.
3. Enable principal caching. Principal attribute caching is on by default. Ensure IDP sync intervals are not so short that they continuously invalidate the cache; 30-minute intervals are a good baseline.
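The tips above reduce to two numeric checks, which could be linted before deployment. A hypothetical sanity check (not a Gateco API; constants taken from the guidance above):

```python
PRINCIPAL_CACHE_TTL_S = 300   # 5 min principal attribute cache
MAX_RECOMMENDED_RULES = 20

def check_tuning(rule_count: int, idp_sync_interval_s: int) -> list[str]:
    # Hypothetical lint for the tuning guidance above.
    warnings = []
    if rule_count > MAX_RECOMMENDED_RULES:
        warnings.append(f"{rule_count} rules: consider splitting the policy")
    if idp_sync_interval_s < PRINCIPAL_CACHE_TTL_S:
        warnings.append("IDP sync interval shorter than the principal cache "
                        "TTL; each sync invalidates the cache before it pays off")
    return warnings

print(check_tuning(rule_count=5, idp_sync_interval_s=1800))   # []
print(check_tuning(rule_count=55, idp_sync_interval_s=60))    # two warnings
```

An empty list means the deployment matches the recommendations; anything else points at the specific knob to adjust.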