tenant-injector

Stable

Purpose

The tenant-injector plugin resolves the tenant identity for every validated request. After token-validator, the preceding plugin in the pipeline, has verified the JWT, tenant-injector calls a configurable tenant-directory HTTP endpoint (substituting the actor principal into the URL), caches the resolved tenant ID per principal, and injects the tenant header (X-Tenant-ID by default) before forwarding, so upstream services receive tenant context without performing their own directory lookups.

Pipeline position:

flowchart LR
TV["token-validator"] --> TI["**tenant-injector**"]:::active
TI --> LC["license-check"]
LC --> PS["prompt-sanitize"]
PS --> OA["otel-audit"]
OA --> UP["upstream"]
classDef active fill:#4ade80,stroke:#16a34a,color:#14532d

tenant-injector sits at pipeline position 2: it runs after token-validator has verified the JWT and populated the actor principal.
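The per-principal caching described under Purpose can be pictured as an LRU map with separate TTLs for positive and negative results. The following is a minimal sketch, not the plugin's actual implementation: the class name TenantCache, the NOT_FOUND sentinel, and the injectable clock are all hypothetical, and only the three tunables (ttl_seconds, negative_ttl_seconds, max_entries) come from the configuration documented below.

```python
import time
from collections import OrderedDict

# Hypothetical sentinel marking a negatively cached principal (404 or parse failure).
NOT_FOUND = object()


class TenantCache:
    """LRU cache keyed by principal, with separate positive and negative TTLs."""

    def __init__(self, ttl_seconds=300, negative_ttl_seconds=30,
                 max_entries=10000, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.negative_ttl = negative_ttl_seconds
        self.max_entries = max_entries
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = OrderedDict()  # principal -> (value, expires_at)

    def get(self, principal):
        """Return the cached tenant ID, NOT_FOUND, or None on a miss."""
        entry = self._entries.get(principal)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[principal]  # expired: treat as a miss
            return None
        self._entries.move_to_end(principal)  # mark as most recently used
        return value

    def put(self, principal, tenant_id):
        """Store a tenant ID, or NOT_FOUND for a negative result."""
        ttl = self.negative_ttl if tenant_id is NOT_FOUND else self.ttl
        self._entries[principal] = (tenant_id, self.clock() + ttl)
        self._entries.move_to_end(principal)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
```

A caller would check get() first, perform the HTTP lookup only when it returns None, and then record the outcome with put(). Keeping the negative TTL short bounds how long a newly provisioned principal stays rejected.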

Config

gateway.yaml (tenant-injector block)
tenant-injector:
  enabled: true  # required; false = Init error (defence-in-depth)
  principal:
    claim: sub  # JWT claim used as principal identifier
  lookup:
    url: https://tenant-svc.internal/resolve/{principal}  # {principal} substituted per request
    method: GET  # GET or POST (default: GET)
    timeout_ms: 500  # default 500; max 30000
    auth:
      mode: none  # none | bearer | mtls
      bearer_token_env: TENANT_LOOKUP_TOKEN  # env var with bearer token (mode: bearer)
      client_cert_path: /etc/certs/client.pem  # PEM cert (mode: mtls)
      client_key_path: /etc/certs/client.key  # PEM key (mode: mtls)
    headers:
      X-Internal-Caller: yaagents-gateway  # optional extra request headers
    response:
      mode: single  # only "single" in v0.4; multi planned v0.5+
      tenant_id_field: tenant_id  # top-level JSON field in lookup response
    cache:
      ttl_seconds: 300  # positive cache TTL (default 300)
      negative_ttl_seconds: 30  # negative cache TTL on 404/parse-fail (default 30)
      max_entries: 10000  # LRU bound (default 10000)
  inject:
    tenant_header: X-Tenant-ID  # header injected into forwarded request
    principal_header: X-Actor-Principal  # optional; leave empty to disable
  on_failure:
    lookup_network_error: 503  # connection refused / DNS fail / reset (default 503)
    lookup_timeout: 503  # timeout_ms exceeded (default 503)
    principal_not_found: 403  # lookup returned 404 (default 403)
    claim_missing: 401  # JWT lacked principal.claim (default 401)
  allowlist: []  # optional: tenant IDs allowed post-derivation
| Field | Type | Default | Required | Description |
|---|---|---|---|---|
| enabled | bool | | yes | Must be true; false is rejected at Init. |
| principal.claim | string | "sub" | no | JWT claim name used as the principal identifier for URL substitution and as the cache key. |
| lookup.url | string (URL template) | | yes | Lookup endpoint. Must contain exactly one {principal} placeholder. |
| lookup.method | string | "GET" | no | HTTP method for the lookup call. GET or POST. |
| lookup.timeout_ms | integer | 500 | no | Per-request lookup timeout in milliseconds. Max 30000. |
| lookup.auth.mode | string | "none" | no | Auth mode for the lookup call. One of none, bearer, mtls. |
| lookup.auth.bearer_token_env | string | | no | Environment variable name holding the bearer token. Used when mode: bearer. |
| lookup.auth.client_cert_path | string | | no | Path to PEM client certificate. Used when mode: mtls. |
| lookup.auth.client_key_path | string | | no | Path to PEM client key. Used when mode: mtls. |
| lookup.headers | map | {} | no | Extra HTTP headers sent with every lookup request. |
| lookup.response.tenant_id_field | string | "tenant_id" | no | Top-level JSON field name in the lookup response body. |
| lookup.cache.ttl_seconds | integer | 300 | no | How long to cache a successful lookup result per principal. |
| lookup.cache.negative_ttl_seconds | integer | 30 | no | How long to cache a negative result (404 or parse failure) per principal. |
| lookup.cache.max_entries | integer | 10000 | no | LRU cache capacity. Least recently used entries are evicted when full. |
| inject.tenant_header | string | "X-Tenant-ID" | no | Header name injected into the forwarded request. |
| inject.principal_header | string | "" | no | Optional header to forward the actor principal. Empty = disabled. |
| on_failure.<class> | integer | see defaults | no | HTTP status code returned for each failure class. |
| allowlist | list | [] | no | If non-empty, the derived tenant ID must be in this list; otherwise the request is rejected with the principal_not_found status. |
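The constraints in the table above lend themselves to an Init-time check. The sketch below is illustrative only: validate_config and its error strings are hypothetical, the config is assumed to be the already-parsed tenant-injector block as a plain dict, and the lower bound on timeout_ms (greater than zero) is an assumption not stated in the table.

```python
MAX_TIMEOUT_MS = 30000
PLACEHOLDER = "{principal}"


def validate_config(cfg):
    """Return a list of error strings; an empty list means the block is valid."""
    errors = []
    # enabled must be literally true; false (or absent) is rejected at Init.
    if cfg.get("enabled") is not True:
        errors.append("enabled must be true")

    lookup = cfg.get("lookup", {})
    if lookup.get("url", "").count(PLACEHOLDER) != 1:
        errors.append("lookup.url must contain exactly one {principal} placeholder")
    if lookup.get("method", "GET") not in ("GET", "POST"):
        errors.append("lookup.method must be GET or POST")

    timeout = lookup.get("timeout_ms", 500)
    # Assumption: a non-positive timeout is treated as invalid.
    if not (isinstance(timeout, int) and 0 < timeout <= MAX_TIMEOUT_MS):
        errors.append("lookup.timeout_ms must be between 1 and 30000")

    mode = lookup.get("auth", {}).get("mode", "none")
    if mode not in ("none", "bearer", "mtls"):
        errors.append("lookup.auth.mode must be none, bearer, or mtls")
    return errors
```

Collecting every error rather than stopping at the first one lets an operator fix a broken block in a single pass.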

Request/Response

Reads from request

| Source | Field | How used |
|---|---|---|
| Request context (from token-validator) | principal.claim (e.g. sub) | Principal identifier; used as key for lookup URL substitution + LRU cache. |
| Inbound request header | inject.tenant_header (e.g. X-Tenant-ID) | Stripped unconditionally before injection (anti-smuggling). |

Writes to request (before forwarding upstream)

| Header | Content | When injected |
|---|---|---|
| inject.tenant_header (X-Tenant-ID) | Resolved tenant ID from lookup or cache. | Always on successful resolution. |
| inject.principal_header (X-Actor-Principal) | Actor principal value. | When inject.principal_header is non-empty. |
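The strip-then-inject behaviour in the two tables above can be sketched as a pure function over a header dict. The name rewrite_headers is hypothetical. Stripping an inbound copy of the principal header is also an assumption of this sketch, since the source only states unconditional stripping for the tenant header.

```python
def rewrite_headers(inbound, tenant_id, principal,
                    tenant_header="X-Tenant-ID",
                    principal_header="X-Actor-Principal"):
    """Return a new header dict with client-supplied identity headers replaced."""
    managed = {tenant_header.lower()}
    if principal_header:
        # Assumption: the principal header is gateway-managed too when enabled.
        managed.add(principal_header.lower())

    # Drop client-supplied copies; HTTP header names are case-insensitive.
    outbound = {k: v for k, v in inbound.items() if k.lower() not in managed}

    outbound[tenant_header] = tenant_id
    if principal_header:  # empty string disables principal forwarding
        outbound[principal_header] = principal
    return outbound
```

For example, an inbound x-tenant-id: attacker-tenant is discarded regardless of letter case, and the forwarded request carries only the gateway-resolved value.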

Writes to response

This plugin does not modify responses. It may return an early rejection response (see Status codes below) before any upstream contact.

Status codes the plugin can return early

| Status | Media type | When |
|---|---|---|
| 401 | application/vnd.yaagents.error+json | JWT lacked principal.claim (on_failure.claim_missing). |
| 403 | application/vnd.yaagents.error+json | Lookup returned 404 (principal unknown to tenant directory), or resolved tenant not in allowlist (on_failure.principal_not_found). |
| 503 | application/vnd.yaagents.error+json | Lookup network error or timeout (on_failure.lookup_network_error / lookup_timeout). |

Security & privacy

What this plugin trusts

  • The actor principal injected into the reqctx by token-validator upstream (verified JWT claim; not re-validated here).
  • The tenant-directory HTTP response body for the field named lookup.response.tenant_id_field — the plugin parses exactly that field; other fields are ignored.
  • The lookup.auth credentials (bearer token from env var, mTLS cert from file mount), which are validated at Init and not reloaded until the gateway restarts.

What this plugin protects

  • Tenant-header smuggling: strips any inbound inject.tenant_header value from the request unconditionally, before injection. A client that sends X-Tenant-ID: attacker-tenant cannot influence the injected value — the stripped value is replaced with the gateway-resolved one.
  • Disabled-by-config bypass: enabled: false is rejected at gateway Init; the gateway exits rather than starting without tenant injection active. This prevents misconfiguration from silently skipping the plugin.
  • Allowlist gate: when allowlist is non-empty, a resolved tenant ID that is not in the list is rejected with the on_failure.principal_not_found status. This prevents a compromised or newly created tenant-directory entry from gaining access merely by existing.

PII boundary

The actor principal (typically a sub claim value, e.g. a UUID or opaque ID) is used as the lookup URL path component and as the LRU cache key. Principal values are logged at WARN level when lookup fails (to aid diagnostics); they are NOT logged on the success path. The resolved tenant ID is injected into the request header but never written to spans or log lines on the happy path.

inject.principal_header (X-Actor-Principal) is an optional downstream header — when enabled, the principal value reaches upstream services and appears in their access logs. Consider whether your upstreams log all request headers before enabling this.

Secrets handling

  • Bearer token: read from the environment variable named in lookup.auth.bearer_token_env at Init. Never stored in gateway YAML or logs.
  • mTLS certificates: loaded from the file paths in lookup.auth.client_cert_path / lookup.auth.client_key_path at Init. Paths are config; key material is not logged.
  • No per-request secret: the lookup URL uses {principal} substitution only — no JWT or token material is forwarded to the tenant directory.
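To make the secrets handling above concrete, the following sketch assembles the lookup URL and headers for mode: none and mode: bearer. It is an illustration under stated assumptions: build_lookup_request is a hypothetical helper, percent-encoding the principal is an assumption (the source only says it is substituted), and the token is read at call time here for brevity even though the plugin reads it once at Init. mTLS is omitted.

```python
import os
from urllib.parse import quote


def build_lookup_request(url_template, principal, auth_mode="none",
                         bearer_token_env="TENANT_LOOKUP_TOKEN",
                         extra_headers=None, env=os.environ):
    """Return (url, headers) for the tenant-directory call."""
    # Assumption: encode the principal so it stays a single path segment.
    url = url_template.replace("{principal}", quote(principal, safe=""))

    headers = dict(extra_headers or {})  # lookup.headers, copied so the input is untouched
    if auth_mode == "bearer":
        # The token comes from the named environment variable, never from YAML.
        headers["Authorization"] = "Bearer " + env[bearer_token_env]
    return url, headers
```

Only the principal and the configured static credentials appear in the outgoing request; no part of the client's JWT is passed along.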

Observability

Spans / events emitted

| Span name | Attributes | When emitted |
|---|---|---|
| tenant.resolve | principal (redacted to prefix), outcome (one of cache_hit, lookup_ok, not_found, error), cache_age_s | Every request. |
| tenant.lookup | url (path only, no query), status, latency_ms | On cache miss, when an HTTP lookup is performed. |

Bench baseline (BENCH-2; commit 7d0dea0; 2026-06-07): p99 overhead +10.3 ms vs no-plugin baseline at 100 RPS with a cache-hit/miss mix. A warm cache (>80% hit rate under steady-state principal rotation) reduces marginal overhead to near zero; the 10.3 ms figure includes cold-start principal lookup latency from the mock webhook server.

Log lines

{"level":"INFO","msg":"tenant.resolve","outcome":"cache_hit","principal_prefix":"usr-abc","request_id":"req-001"}
{"level":"WARN","msg":"tenant.lookup","outcome":"not_found","principal_prefix":"usr-xyz","status":404,"request_id":"req-002"}
{"level":"WARN","msg":"tenant.lookup","outcome":"network_error","error":"connection refused","request_id":"req-003"}

Principal values in log lines are truncated to a prefix (the first 8 characters followed by a truncation marker) so that logs never expose a complete principal. The full principal is never logged.

Metrics

| Metric | Type | Labels | Description |
|---|---|---|---|
| yaagents_plugin_tenant_resolve_total | counter | outcome | Cumulative resolutions by outcome (cache_hit, lookup_ok, not_found, error). |
| yaagents_plugin_tenant_lookup_duration_seconds | histogram | status | HTTP lookup latency when a network call is made. |
| yaagents_plugin_tenant_cache_size | gauge | | Current LRU cache occupancy (entries in use). |

Correlation-id propagation

The plugin reads X-Correlation-ID from the inbound request and attaches it as the correlation_id attribute on the tenant.resolve and tenant.lookup spans. The same value is set as the outbound X-Correlation-ID header on the tenant-directory lookup call, so the lookup request appears in the tenant directory's own traces with the same correlation ID as the original gateway request.
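A minimal sketch of that propagation, using plain dicts in place of real request and span objects. The function name is hypothetical, and the behaviour when the inbound header is absent is not specified in the source, so this sketch simply does nothing in that case.

```python
def propagate_correlation_id(inbound_headers, lookup_headers, span_attributes):
    """Copy X-Correlation-ID from the inbound request onto the lookup call and span."""
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {k.lower(): v for k, v in inbound_headers.items()}
    correlation_id = lowered.get("x-correlation-id")
    if correlation_id is None:
        return None  # assumption: nothing is generated when the header is missing

    lookup_headers["X-Correlation-ID"] = correlation_id   # outbound lookup request
    span_attributes["correlation_id"] = correlation_id    # tenant.resolve / tenant.lookup
    return correlation_id
```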

Failure modes

| Failure | Configurable behavior | What the client sees |
|---|---|---|
| principal.claim absent from JWT | on_failure.claim_missing (default 401) | 401 application/vnd.yaagents.error+json |
| Lookup: connection refused / DNS fail | on_failure.lookup_network_error (default 503) | 503 application/vnd.yaagents.error+json |
| Lookup: timeout exceeded | on_failure.lookup_timeout (default 503) | 503 application/vnd.yaagents.error+json |
| Lookup: directory returns 404 | on_failure.principal_not_found (default 403) | 403 application/vnd.yaagents.error+json; negatively cached for negative_ttl_seconds. |
| Lookup: non-2xx (other than 404) | Treated as network-class error → lookup_network_error | 503 application/vnd.yaagents.error+json |
| Lookup: 2xx but tenant_id_field absent | Treated as parse failure → negatively cached → lookup_network_error | 503 application/vnd.yaagents.error+json |
| Allowlist gate: derived tenant not in list | on_failure.principal_not_found (default 403) | 403 application/vnd.yaagents.error+json |
| enabled: false in config | Fixed Init error; gateway exits 1 | Gateway does not start. |
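The lookup-related rows of the table above can be read as a small decision function. The sketch below is one possible encoding, not the plugin's code: classify_lookup and early_status are hypothetical names, the response body is assumed to be already parsed JSON (or None), an empty tenant ID is assumed to count as absent, and whether an allowlist rejection is negatively cached is not stated in the source, so it is left uncached here.

```python
DEFAULT_ON_FAILURE = {
    "lookup_network_error": 503,
    "lookup_timeout": 503,
    "principal_not_found": 403,
    "claim_missing": 401,
}


def classify_lookup(status, body, tenant_id_field="tenant_id", allowlist=()):
    """Return (failure_class or None, tenant_id or None, cache_negatively)."""
    if status == 404:
        return "principal_not_found", None, True
    if not 200 <= status < 300:
        # Any other non-2xx is treated as a network-class error.
        return "lookup_network_error", None, False

    tenant_id = body.get(tenant_id_field) if isinstance(body, dict) else None
    if not tenant_id:
        # Parse failure: negatively cached, reported as lookup_network_error.
        return "lookup_network_error", None, True
    if allowlist and tenant_id not in allowlist:
        return "principal_not_found", None, False  # assumption: not cached
    return None, tenant_id, False


def early_status(failure_class, on_failure=None):
    """Map a failure class to its HTTP status, honouring on_failure overrides."""
    return {**DEFAULT_ON_FAILURE, **(on_failure or {})}[failure_class]
```

Separating classification from status mapping mirrors the config: the failure class is fixed by what happened, while the status code per class is operator-configurable.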