BYOK: Per-Org LLM Keys, KMS Encryption, and How the Credit Model Works
Grounded Answers — policy-aware answer synthesis that returns cited responses from policy-filtered retrieval chunks — requires an LLM API call to synthesize the final answer. Until now, Gateco's shared OpenAI key handled that call for all paid-tier organizations. Starting today, each organization can supply its own OpenAI API key, stored encrypted and isolated by tenant. This is the BYOK (bring your own key) model, and it changes how costs, data handling, and key rotation work.
How to add your key
Navigate to Organization Settings → API Keys → LLM API Key. Add your OpenAI API key. The key is encrypted immediately on submission — the plaintext is never written to the database. Once configured, every Grounded Answers request for your organization uses your key rather than Gateco's fallback. Key rotation works the same way: add a new key, and Gateco switches immediately. There is no overlap period where both keys are active.
The credit model for organizations without their own key
Paid-tier organizations (Team, Growth, Enterprise) that have not yet configured their own key receive 100 lifetime fallback synthesis calls on Gateco's shared OpenAI key. This covers initial evaluation and onboarding. Once those 100 calls are used, the answer endpoint returns HTTP 422 with error code LLM_CREDIT_EXHAUSTED. The policy-filtered retrieval chunks are still returned; only the synthesis step is blocked. Free-tier organizations have no fallback allocation and must configure their own key to use Grounded Answers at all.
The 100-call limit is per organization, not per user or per month. It does not reset. It is designed as an evaluation window, not ongoing capacity. If your team intends to use Grounded Answers in production, configure your own key before the limit is reached.
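The rules above reduce to a small decision about which key, if any, handles a synthesis call. The following is a minimal sketch of that decision, not Gateco's implementation: the names Org, KeySource, and choose_key_source are hypothetical, and the KEY_REQUIRED outcome for free-tier organizations is an assumption, since this post does not say what the endpoint returns in that case.

```python
from dataclasses import dataclass
from enum import Enum

FALLBACK_CALL_LIMIT = 100  # lifetime, per organization, never resets
PAID_TIERS = {"team", "growth", "enterprise"}


class KeySource(Enum):
    ORG_KEY = "org_key"                    # organization's own OpenAI key
    SHARED_FALLBACK = "shared_fallback"    # Gateco's shared key, counts against the 100
    CREDIT_EXHAUSTED = "credit_exhausted"  # maps to 422 LLM_CREDIT_EXHAUSTED
    KEY_REQUIRED = "key_required"          # assumed outcome for free tier without a key


@dataclass
class Org:
    # Hypothetical record; field names are illustrative only.
    tier: str
    has_own_key: bool
    fallback_calls_used: int = 0


def choose_key_source(org: Org) -> KeySource:
    """Decide which key (if any) serves the synthesis step for this org."""
    if org.has_own_key:
        # A configured key always wins, regardless of tier or credits used.
        return KeySource.ORG_KEY
    if org.tier not in PAID_TIERS:
        # Free tier has no fallback allocation.
        return KeySource.KEY_REQUIRED
    if org.fallback_calls_used < FALLBACK_CALL_LIMIT:
        return KeySource.SHARED_FALLBACK
    # Retrieval chunks are still returned; only synthesis is blocked.
    return KeySource.CREDIT_EXHAUSTED
```

Under this sketch, a Growth organization with 99 fallback calls used still gets one more shared-key synthesis, and the 101st request is the first to see LLM_CREDIT_EXHAUSTED.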
How the key is stored: envelope encryption with per-tenant KMS context
Each organization's LLM key is stored using envelope encryption: a data encryption key (DEK) generated per organization encrypts the API key with AES-256-GCM, and the DEK itself is then wrapped by a KMS customer master key. The critical detail is the EncryptionContext: the DEK is wrapped with the organization_id bound in as KMS encryption context. KMS will only unwrap a DEK when the decrypt request carries the same context it was wrapped with, so a DEK wrapped for Organization A cannot be unwrapped under Organization B's context, and without the DEK the API key stays unreadable. KMS enforces this at the cryptographic layer, not the application layer.
In production, this uses AWS KMS with a customer-managed key. The connector credential encryption uses the same architecture. If you have already verified that connector credentials are isolated between organizations, LLM key storage follows the same guarantees.
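To make the context binding concrete, here is a minimal sketch of envelope encryption with an organization-scoped encryption context. It is an illustration under stated assumptions, not Gateco's code: LocalKms is a hypothetical in-memory stand-in for KMS whose method shapes are loosely modeled on boto3's generate_data_key and decrypt (the real calls also take a KeyId and a KeySpec), and encrypt_llm_key, decrypt_llm_key, and the record field names are invented for the example. AES-256-GCM comes from the third-party cryptography package.

```python
from __future__ import annotations

import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


class LocalKms:
    """Hypothetical in-memory stand-in for KMS, for illustration only."""

    def __init__(self) -> None:
        self._master_key = AESGCM.generate_key(bit_length=256)

    @staticmethod
    def _aad(context: dict[str, str]) -> bytes:
        # Canonical encoding of the encryption context. It is passed as
        # AES-GCM associated data, which binds the wrapped DEK to it.
        return "&".join(f"{k}={v}" for k, v in sorted(context.items())).encode()

    def generate_data_key(self, EncryptionContext: dict[str, str]) -> dict[str, bytes]:
        dek = AESGCM.generate_key(bit_length=256)
        nonce = os.urandom(12)
        wrapped = AESGCM(self._master_key).encrypt(nonce, dek, self._aad(EncryptionContext))
        return {"Plaintext": dek, "CiphertextBlob": nonce + wrapped}

    def decrypt(self, CiphertextBlob: bytes, EncryptionContext: dict[str, str]) -> dict[str, bytes]:
        nonce, wrapped = CiphertextBlob[:12], CiphertextBlob[12:]
        # Raises InvalidTag when the context differs from the one used to wrap.
        dek = AESGCM(self._master_key).decrypt(nonce, wrapped, self._aad(EncryptionContext))
        return {"Plaintext": dek}


def encrypt_llm_key(kms, organization_id: str, api_key: str) -> dict[str, bytes]:
    """Encrypt an API key under a fresh DEK; only ciphertext material is returned."""
    data_key = kms.generate_data_key(EncryptionContext={"organization_id": organization_id})
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key["Plaintext"]).encrypt(nonce, api_key.encode(), None)
    # The plaintext DEK is deliberately not part of the stored record.
    return {"wrapped_dek": data_key["CiphertextBlob"], "nonce": nonce, "ciphertext": ciphertext}


def decrypt_llm_key(kms, organization_id: str, record: dict[str, bytes]) -> str:
    """Unwrap the DEK under the caller's organization context, then decrypt."""
    dek = kms.decrypt(
        CiphertextBlob=record["wrapped_dek"],
        EncryptionContext={"organization_id": organization_id},
    )["Plaintext"]
    return AESGCM(dek).decrypt(record["nonce"], record["ciphertext"], None).decode()


if __name__ == "__main__":
    kms = LocalKms()
    record = encrypt_llm_key(kms, "org-a", "sk-example")
    print(decrypt_llm_key(kms, "org-a", record) == "sk-example")
    try:
        decrypt_llm_key(kms, "org-b", record)
    except InvalidTag:
        print("org-b context rejected")
```

In this stand-in a mismatched context surfaces as InvalidTag from the AES-GCM unwrap; a real KMS would reject the decrypt request with its own error. Either way, the failure happens when unwrapping the DEK, before any application-level tenant check runs.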
Why this matters beyond cost
Cost is the obvious reason — using your own OpenAI key means LLM spend goes directly to your account, with your billing, your rate limits, and your usage visibility. But for enterprise customers, data handling is equally important. When Grounded Answers runs against your retrieval chunks using your key, the API call goes from Gateco's synthesis service to OpenAI's API with your credentials. The relationship between the data and the API key is explicit and auditable. Organizations that need full control over which model handles synthesis can configure the model via the GATECO_ANSWER_MODEL environment variable on self-hosted deployments. The default is gpt-4o-mini.
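For self-hosted deployments, the model setting described above can be read as a plain environment lookup with a default. This is a small sketch of that pattern, assuming nothing about Gateco's internals: resolve_answer_model is a hypothetical helper, and treating an empty value as unset is a choice made for this example.

```python
import os
from typing import Mapping

DEFAULT_ANSWER_MODEL = "gpt-4o-mini"  # default stated in this post


def resolve_answer_model(env: Mapping[str, str] = os.environ) -> str:
    """Return the synthesis model name from GATECO_ANSWER_MODEL, or the default."""
    # Assumption of this sketch: an empty or whitespace-only value counts as unset.
    value = env.get("GATECO_ANSWER_MODEL", "").strip()
    return value or DEFAULT_ANSWER_MODEL
```

With the variable unset, the helper returns gpt-4o-mini; with GATECO_ANSWER_MODEL=gpt-4o exported, it returns gpt-4o.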