Performance & Latency

Gateco adds a policy evaluation layer between your AI agent and your vector database. This overhead is deterministic and bounded.

p95 Latency

25 ms overhead above the raw vector DB call, measured against a pgvector baseline with a 5-rule RBAC policy and 100 resources. Policy evaluation time is dominated by principal attribute lookup and AST traversal — both are O(rules) and cached across requests.
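The O(rules) evaluation can be pictured as a single linear pass over the compiled rule list. The sketch below is purely illustrative — the `Rule` shape, field names, and first-match semantics are assumptions, not Gateco's actual policy engine:

```python
from dataclasses import dataclass

# Illustrative sketch only: rule shape and first-match semantics are
# hypothetical, not Gateco's actual engine.
@dataclass
class Rule:
    effect: str     # "allow" or "deny"
    roles: set      # roles the rule applies to
    resources: set  # resource IDs the rule covers

def evaluate(rules, principal_roles, resource):
    # One linear pass over the compiled rules: O(rules).
    for rule in rules:
        if resource in rule.resources and principal_roles & rule.roles:
            return rule.effect == "allow"
    return False  # default-deny when no rule matches

rules = [
    Rule("deny",  {"contractor"}, {"doc-1", "doc-2"}),
    Rule("allow", {"analyst"},    {"doc-1", "doc-2", "doc-3"}),
]
evaluate(rules, {"analyst"}, "doc-1")                # True
evaluate(rules, {"contractor", "analyst"}, "doc-1")  # False: deny matches first
```

With rule counts in the single digits, this pass is microseconds of work; the principal attribute lookup that feeds `principal_roles` is the larger cost, which is why it is cached (see below).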

Caching Layers

Gateco applies three caching layers to reduce policy evaluation overhead on repeated requests.

Per-org policy compilation cache

60 s TTL. The compiled policy AST is stored in-process. Policy updates (activate/archive) invalidate the cache immediately.
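The behavior of this cache — TTL-based expiry plus immediate invalidation on policy updates — can be sketched as follows. The class and method names are illustrative, not Gateco's API:

```python
import time

# Minimal TTL cache with explicit invalidation, mirroring the 60 s
# compiled-policy cache described above. Names are illustrative.
class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # lazy expiry on read
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

    def invalidate(self, key):
        # Called on policy activate/archive so updates apply immediately,
        # without waiting for the TTL to lapse.
        self._store.pop(key, None)

cache = TTLCache(ttl_seconds=60)
cache.put("org-123", "<compiled policy AST>")
cache.invalidate("org-123")  # policy updated: next request recompiles
```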

Principal attribute cache

5 min TTL. Principal role and group data is cached after first lookup. The cache is refreshed automatically on every IDP sync.

Per-query negative result cache

30 s TTL. Denied queries are cached by principal + resource set fingerprint to avoid redundant policy evaluation on rapid repeated attempts.
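A principal + resource set fingerprint can be derived by hashing the principal ID together with a sorted list of resource IDs, so the same set always produces the same key regardless of retrieval order. The exact key derivation below is an assumption, shown only to make the idea concrete:

```python
import hashlib

# Hypothetical fingerprint for the negative result cache: the real key
# derivation inside Gateco may differ.
def negative_cache_key(principal_id, resource_ids):
    # Sort so the same resource set yields the same fingerprint,
    # regardless of the order results came back in.
    material = principal_id + "|" + ",".join(sorted(resource_ids))
    return hashlib.sha256(material.encode()).hexdigest()

negative_cache_key("user-42", ["doc-3", "doc-1"])  # same key as ["doc-1", "doc-3"]
```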

Throughput

A single backend pod sustains approximately 200 RPS. Scale horizontally — all state is in Postgres and workers are stateless. The auto-sync scheduler uses PostgreSQL advisory locks to coordinate across instances without double-execution.
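The advisory-lock pattern works because PostgreSQL grants a given lock key to exactly one session at a time. A common way to apply it, sketched below under the assumption that job names are hashed down to the signed 64-bit keys `pg_try_advisory_lock` expects (the job name and helper are hypothetical, not Gateco's identifiers):

```python
import hashlib

# Sketch: coordinate a scheduled job across stateless workers with a
# PostgreSQL advisory lock. The job name and helper are hypothetical.
def advisory_lock_key(job_name):
    # pg_try_advisory_lock takes a signed 64-bit key, so hash the job
    # name down to that range deterministically.
    digest = hashlib.sha256(job_name.encode()).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

# Each worker attempts the same key; exactly one session wins:
#   SELECT pg_try_advisory_lock(%s);   -- true for one session, false elsewhere
#   ... run the sync ...
#   SELECT pg_advisory_unlock(%s);
key = advisory_lock_key("auto-sync")
```

Workers that fail to acquire the lock simply skip the run, so the schedule executes once per interval no matter how many pods are live.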

Connector Latency

Approximate p95 latency for the vector query leg only (excludes policy evaluation overhead). Values are representative of cloud or locally hosted deployments in the same region as the Gateco backend.

Connector                  Approximate p95 latency
pgvector (local)           2–5 ms
Supabase                   15–30 ms
Neon                       15–35 ms
Qdrant (cloud)             20–50 ms
Pinecone                   25–60 ms
Weaviate                   20–50 ms
OpenSearch                 30–70 ms
Milvus                     25–60 ms
Chroma                     10–30 ms
Azure AI Search            30–80 ms
Vertex AI Vector Search    30–80 ms
Vertex AI Search           40–100 ms
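To budget end-to-end latency, add the 25 ms policy overhead to the connector leg. Summing p95s is a conservative back-of-envelope (the true end-to-end p95 is usually lower), and the midpoint figures below are representative, not guarantees:

```python
# Back-of-envelope p95 budget: policy overhead plus connector leg.
# Midpoints taken from the table above; figures are representative only.
POLICY_OVERHEAD_P95_MS = 25

CONNECTOR_P95_MS = {
    "pgvector (local)": 3.5,
    "Pinecone": 42.5,
    "Vertex AI Search": 70.0,
}

def estimated_p95_ms(connector):
    # Conservative estimate: sum of p95s bounds the typical combined p95.
    return POLICY_OVERHEAD_P95_MS + CONNECTOR_P95_MS[connector]

estimated_p95_ms("pgvector (local)")  # 28.5 ms budget for the full call
```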

Getting Better Performance

1. Use sidecar metadata resolution (default)

   Sidecar mode reads metadata from Gateco's own registry — one indexed Postgres lookup per retrieval. Inline and sql_view modes add an extra network round-trip to the connector.

2. Keep policy rule count under 20 per policy

   Policy AST compilation cost is O(rules). Below 20 rules, compilation is sub-millisecond and effectively free after warm cache. Above 50 rules, consider splitting into multiple narrower policies.

3. Enable principal caching

   Principal attribute caching is on by default. Ensure IDP sync intervals are not so short that they continuously invalidate the cache — 30-minute intervals are a good baseline.
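Tip 2's thresholds are easy to enforce before deploying a policy. The helper below is a hypothetical pre-deploy check using the 20- and 50-rule thresholds from the text; it is not part of Gateco:

```python
# Hypothetical pre-deploy lint for policy size. Thresholds (20, 50) come
# from the guidance above; the function itself is not a Gateco API.
def check_rule_count(policy_name, rules):
    n = len(rules)
    if n > 50:
        return f"{policy_name}: {n} rules - split into narrower policies"
    if n > 20:
        return f"{policy_name}: {n} rules - compilation cost grows linearly"
    return f"{policy_name}: {n} rules - ok"

check_rule_count("docs-rbac", ["rule"] * 12)  # "docs-rbac: 12 rules - ok"
```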