Performance & Latency
Gateco adds a policy evaluation layer between your AI agent and your vector database. This overhead is deterministic and bounded.
p95 Latency
The p95 overhead is 25 ms above the raw vector DB call, measured against a pgvector baseline with a 5-rule RBAC policy and 100 resources. Policy evaluation time is dominated by principal attribute lookup and AST traversal; both are O(rules) and cached across requests.
Caching Layers
Gateco applies three caching layers to reduce policy evaluation overhead on repeated requests.
Per-org policy compilation cache
60 s TTL. The compiled policy AST is stored in-process. Policy updates (activate/archive) invalidate the cache immediately.
Principal attribute cache
5 min TTL. Principal role and group data is cached after first lookup. The cache is refreshed automatically on every IDP sync.
Per-query negative result cache
30 s TTL. Denied queries are cached by principal + resource set fingerprint to avoid redundant policy evaluation on rapid repeated attempts.
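The three layers above share one pattern: an in-process map whose entries expire after a fixed TTL, plus explicit invalidation. The following is a minimal sketch of that pattern, not Gateco's actual implementation. The names `TTLCache`, `resource_fingerprint`, `remember_denial`, and `is_denied_cached` are hypothetical, and the fingerprint scheme (SHA-256 over sorted resource IDs) is an assumption chosen for illustration.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Hashable, Iterable, Tuple


class TTLCache:
    """Tiny in-process cache whose entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        """Return the cached value, or None if absent or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # drop the stale entry lazily
            return None
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._store[key] = (self.clock() + self.ttl, value)

    def invalidate(self, key: Hashable) -> None:
        """Explicit invalidation, e.g. when a policy is activated or archived."""
        self._store.pop(key, None)


def resource_fingerprint(resource_ids: Iterable[str]) -> str:
    """Order-independent fingerprint of a resource set (assumed scheme)."""
    joined = "\n".join(sorted(resource_ids))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# TTLs taken from the text: 60 s, 5 min, and 30 s respectively.
policy_cache = TTLCache(60)
principal_cache = TTLCache(300)
denial_cache = TTLCache(30)


def remember_denial(principal_id: str, resource_ids: Iterable[str]) -> None:
    denial_cache.put((principal_id, resource_fingerprint(resource_ids)), True)


def is_denied_cached(principal_id: str, resource_ids: Iterable[str]) -> bool:
    key = (principal_id, resource_fingerprint(resource_ids))
    return denial_cache.get(key) is True
```

Sorting the resource IDs before hashing means the same resource set produces the same cache key regardless of the order in which results were returned.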
Throughput
A single backend pod sustains approximately 200 RPS. To go beyond that, scale horizontally: all state is in Postgres, so workers are stateless. The auto-sync scheduler uses PostgreSQL advisory locks to coordinate across instances, so a scheduled sync is never executed twice.
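To illustrate how advisory locks prevent double-execution, here is a hedged sketch of the general technique. It is not Gateco's scheduler code. `pg_try_advisory_lock` and `pg_advisory_unlock` are standard PostgreSQL functions that take a `bigint` key; the helper names `advisory_lock_key` and `run_exclusively`, the job name, and the hash-based key derivation are assumptions. The `cursor` is assumed to be a DB-API cursor using `%s` placeholders (as in psycopg) on an autocommit connection.

```python
import hashlib
from typing import Any, Callable


def advisory_lock_key(name: str) -> int:
    """Map a job name to a signed 64-bit integer (PostgreSQL bigint range)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def run_exclusively(cursor: Any, job_name: str, job: Callable[[], Any]) -> bool:
    """Run job() only if this session wins the advisory lock.

    Returns True if the job ran, False if another instance holds the lock.
    Session-level advisory locks must be released on the same connection
    that acquired them, hence the unlock in the finally block.
    """
    key = advisory_lock_key(job_name)
    cursor.execute("SELECT pg_try_advisory_lock(%s)", (key,))
    (acquired,) = cursor.fetchone()
    if not acquired:
        return False  # another instance is already running this job
    try:
        job()
    finally:
        cursor.execute("SELECT pg_advisory_unlock(%s)", (key,))
    return True
```

Because `pg_try_advisory_lock` returns immediately instead of blocking, instances that lose the race simply skip the run and wait for the next scheduler tick.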
Connector Latency
The table lists approximate p95 latency for the vector query leg only, excluding policy evaluation overhead. Values are representative of cloud or locally hosted deployments in the same region as the Gateco backend.
| Connector | Approximate p95 latency |
|---|---|
| pgvector (local) | 2–5 ms |
| Supabase | 15–30 ms |
| Neon | 15–35 ms |
| Qdrant (cloud) | 20–50 ms |
| Pinecone | 25–60 ms |
| Weaviate | 20–50 ms |
| OpenSearch | 30–70 ms |
| Milvus | 25–60 ms |
| Chroma | 10–30 ms |
| Azure AI Search | 30–80 ms |
| Vertex AI Vector Search | 30–80 ms |
| Vertex AI Search | 40–100 ms |
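For rough capacity planning, the connector figures can be combined with the 25 ms policy overhead from the p95 Latency section. The sketch below simply adds the two numbers. Treat the result as a planning estimate only: percentiles of independent stages do not add exactly, and the 25 ms figure was measured against a pgvector baseline. The function name and the subset of connectors shown are illustrative.

```python
from typing import Dict, Tuple

POLICY_OVERHEAD_P95_MS = 25  # from the p95 Latency section

# (low, high) approximate p95 in ms, copied from a few rows of the table above
CONNECTOR_P95_MS: Dict[str, Tuple[int, int]] = {
    "pgvector (local)": (2, 5),
    "Supabase": (15, 30),
    "Pinecone": (25, 60),
    "Vertex AI Search": (40, 100),
}


def estimated_total_p95_ms(connector: str) -> Tuple[int, int]:
    """Rough end-to-end p95 range: connector leg plus policy overhead."""
    low, high = CONNECTOR_P95_MS[connector]
    return (low + POLICY_OVERHEAD_P95_MS, high + POLICY_OVERHEAD_P95_MS)
```

Under these assumptions, `estimated_total_p95_ms("pgvector (local)")` gives `(27, 30)` and `estimated_total_p95_ms("Pinecone")` gives `(50, 85)`.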
Getting Better Performance
1. Use sidecar metadata resolution (default)
   Sidecar mode reads metadata from Gateco's own registry, one indexed Postgres lookup per retrieval. Inline and sql_view modes add an extra network round-trip to the connector.
2. Keep policy rule count under 20 per policy
   Policy AST compilation cost is O(rules). Below 20 rules, compilation is sub-millisecond and effectively free after warm cache. Above 50 rules, consider splitting into multiple narrower policies.
3. Enable principal caching
   Principal attribute caching is on by default. Ensure IDP sync intervals are not so short that they continuously invalidate the cache. A 30-minute interval is a good baseline.
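The rule-count guidance in the tips above can be turned into a simple check over your own policy inventory. This is a hypothetical helper, not part of Gateco: the function names and the "ok" / "watch" / "split" labels are assumptions, and only the thresholds (20 and 50) come from the text.

```python
from typing import Dict


def rule_count_advice(rule_count: int) -> str:
    """Classify a policy by rule count using the thresholds from the text."""
    if rule_count < 20:
        return "ok"     # compilation is sub-millisecond
    if rule_count <= 50:
        return "watch"  # above the recommended size, below the split threshold
    return "split"      # consider multiple narrower policies


def policies_needing_attention(rule_counts: Dict[str, int]) -> Dict[str, str]:
    """Return only the policies whose advice is not 'ok'."""
    advice = {name: rule_count_advice(n) for name, n in rule_counts.items()}
    return {name: a for name, a in advice.items() if a != "ok"}
```

For example, an inventory of `{"a": 5, "b": 35, "c": 80}` would flag `b` as "watch" and `c` as "split".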