Performance & Latency

Gateco adds a policy evaluation layer between your AI agent and your vector database. This overhead is deterministic and bounded.

p95 Latency

25 ms overhead above the raw vector DB call, measured against a pgvector baseline with a 5-rule RBAC policy and 100 resources. Policy evaluation time is dominated by principal attribute lookup and AST traversal — both are O(rules) and cached across requests.
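The O(rules) evaluation can be pictured as a single linear pass over the compiled rule list. The sketch below is purely illustrative — the `Rule` shape, field names, and first-match semantics are assumptions, not Gateco's actual policy engine:

```python
from dataclasses import dataclass

# Illustrative sketch only: rule shape and first-match semantics are
# hypothetical, not Gateco's actual engine.
@dataclass
class Rule:
    effect: str     # "allow" or "deny"
    roles: set      # roles the rule applies to
    resources: set  # resource IDs the rule covers

def evaluate(rules, principal_roles, resource):
    # One linear pass over the compiled rules: O(rules).
    for rule in rules:
        if resource in rule.resources and principal_roles & rule.roles:
            return rule.effect == "allow"
    return False  # default-deny when no rule matches

rules = [
    Rule("deny",  {"contractor"}, {"doc-1", "doc-2"}),
    Rule("allow", {"analyst"},    {"doc-1", "doc-2", "doc-3"}),
]
evaluate(rules, {"analyst"}, "doc-1")                # True
evaluate(rules, {"contractor", "analyst"}, "doc-1")  # False: deny matches first
```

With rule counts in the single digits, this pass is microseconds of work; the principal attribute lookup that feeds `principal_roles` is the larger cost, which is why it is cached (see below).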

Caching Layers

Gateco applies three caching layers to reduce policy evaluation overhead on repeated requests.

Per-org policy compilation cache

60 s TTL. The compiled policy AST is stored in-process. Policy updates (activate/archive) invalidate the cache immediately.
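The behavior of this cache — TTL-based expiry plus immediate invalidation on policy updates — can be sketched as follows. The class and method names are illustrative, not Gateco's API:

```python
import time

# Minimal TTL cache with explicit invalidation, mirroring the 60 s
# compiled-policy cache described above. Names are illustrative.
class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # lazy expiry on read
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

    def invalidate(self, key):
        # Called on policy activate/archive so updates apply immediately,
        # without waiting for the TTL to lapse.
        self._store.pop(key, None)

cache = TTLCache(ttl_seconds=60)
cache.put("org-123", "<compiled policy AST>")
cache.invalidate("org-123")  # policy updated: next request recompiles
```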

Principal attribute cache

5 min TTL. Principal role and group data is cached after first lookup. The cache is refreshed automatically on every IDP sync.

Per-query negative result cache

30 s TTL. Denied queries are cached by principal + resource set fingerprint to avoid redundant policy evaluation on rapid repeated attempts.
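A principal + resource set fingerprint can be derived by hashing the principal ID together with a sorted list of resource IDs, so the same set always produces the same key regardless of retrieval order. The exact key derivation below is an assumption, shown only to make the idea concrete:

```python
import hashlib

# Hypothetical fingerprint for the negative result cache: the real key
# derivation inside Gateco may differ.
def negative_cache_key(principal_id, resource_ids):
    # Sort so the same resource set yields the same fingerprint,
    # regardless of the order results came back in.
    material = principal_id + "|" + ",".join(sorted(resource_ids))
    return hashlib.sha256(material.encode()).hexdigest()

negative_cache_key("user-42", ["doc-3", "doc-1"])  # same key as ["doc-1", "doc-3"]
```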

Throughput

A single backend pod sustains approximately 200 RPS. Scale horizontally — all state is in Postgres and workers are stateless. The auto-sync scheduler uses PostgreSQL advisory locks to coordinate across instances without double-execution.
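The advisory-lock pattern works because PostgreSQL grants a given lock key to exactly one session at a time. A common way to apply it, sketched below under the assumption that job names are hashed down to the signed 64-bit keys `pg_try_advisory_lock` expects (the job name and helper are hypothetical, not Gateco's identifiers):

```python
import hashlib

# Sketch: coordinate a scheduled job across stateless workers with a
# PostgreSQL advisory lock. The job name and helper are hypothetical.
def advisory_lock_key(job_name):
    # pg_try_advisory_lock takes a signed 64-bit key, so hash the job
    # name down to that range deterministically.
    digest = hashlib.sha256(job_name.encode()).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

# Each worker attempts the same key; exactly one session wins:
#   SELECT pg_try_advisory_lock(%s);   -- true for one session, false elsewhere
#   ... run the sync ...
#   SELECT pg_advisory_unlock(%s);
key = advisory_lock_key("auto-sync")
```

Workers that fail to acquire the lock simply skip the run, so the schedule executes once per interval no matter how many pods are live.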

Connector Latency

Approximate p95 latency for the vector query leg only (excludes policy evaluation overhead). Values are representative of cloud or locally hosted deployments in the same region as the Gateco backend.

Connector                  Approximate p95 latency
pgvector (local)           2–5 ms
Supabase                   15–30 ms
Neon                       15–35 ms
Qdrant (cloud)             20–50 ms
Pinecone                   25–60 ms
Weaviate                   20–50 ms
OpenSearch                 30–70 ms
Milvus                     25–60 ms
Chroma                     10–30 ms
Azure AI Search            30–80 ms
Vertex AI Vector Search    30–80 ms
Vertex AI Search           40–100 ms
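To budget end-to-end latency, add the 25 ms policy overhead to the connector leg. Summing p95s is a conservative back-of-envelope (the true end-to-end p95 is usually lower), and the midpoint figures below are representative, not guarantees:

```python
# Back-of-envelope p95 budget: policy overhead plus connector leg.
# Midpoints taken from the table above; figures are representative only.
POLICY_OVERHEAD_P95_MS = 25

CONNECTOR_P95_MS = {
    "pgvector (local)": 3.5,
    "Pinecone": 42.5,
    "Vertex AI Search": 70.0,
}

def estimated_p95_ms(connector):
    # Conservative estimate: sum of p95s bounds the typical combined p95.
    return POLICY_OVERHEAD_P95_MS + CONNECTOR_P95_MS[connector]

estimated_p95_ms("pgvector (local)")  # 28.5 ms budget for the full call
```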

Getting Better Performance

1. Use sidecar metadata resolution (default)

   Sidecar mode reads metadata from Gateco's own registry — one indexed Postgres lookup per retrieval. Inline and sql_view modes add an extra network round-trip to the connector.

2. Keep policy rule count under 20 per policy

   Policy AST compilation cost is O(rules). Below 20 rules, compilation is sub-millisecond and effectively free after warm cache. Above 50 rules, consider splitting into multiple narrower policies.

3. Enable principal caching

   Principal attribute caching is on by default. Ensure IDP sync intervals are not so short that they continuously invalidate the cache — 30-minute intervals are a good baseline.
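Tip 2's thresholds are easy to enforce before deploying a policy. The helper below is a hypothetical pre-deploy check using the 20- and 50-rule thresholds from the text; it is not part of Gateco:

```python
# Hypothetical pre-deploy lint for policy size. Thresholds (20, 50) come
# from the guidance above; the function itself is not a Gateco API.
def check_rule_count(policy_name, rules):
    n = len(rules)
    if n > 50:
        return f"{policy_name}: {n} rules - split into narrower policies"
    if n > 20:
        return f"{policy_name}: {n} rules - compilation cost grows linearly"
    return f"{policy_name}: {n} rules - ok"

check_rule_count("docs-rbac", ["rule"] * 12)  # "docs-rbac: 12 rules - ok"
```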