Node Pruning

Over time, attested node records accumulate in the Cofide SPIRE datastore. When a node is decommissioned, its SPIRE agent stops renewing its certificate, but the node record remains until it is explicitly removed. In long-lived clusters where nodes rotate regularly, this causes the datastore to grow unboundedly and makes the list of attested nodes increasingly noisy.

Node pruning is disabled by default. This page explains how to enable it and how to choose the right settings for your environment.

When pruning is enabled, Cofide SPIRE runs a background job approximately every hour that deletes node records whose certificate has been expired for longer than a configurable grace period. The grace period exists to avoid deleting records for nodes that are temporarily offline and will return to service.

Two Helm values control this behaviour:

  • pruneAttestedNodesExpiredFor sets the grace period as a Go duration string (e.g. "24h", "168h"). Setting this value enables pruning. Leaving it empty (the default) disables it.
  • pruneTOFUNodes controls whether non-reattestable nodes are included. This must be set to true for cloud provider nodes to be pruned (see below). It has no effect if pruneAttestedNodesExpiredFor is not set.

The appropriate settings depend on which node attestors are in use in the trust domain. The key distinction is whether a node can re-attest automatically if its record is deleted.

| Attestor | Re-attestable | Behaviour after pruning |
|---|---|---|
| Kubernetes PSAT | Yes | The SPIRE agent re-attests automatically on restart. Transparent to workloads. |
| TPM | Yes | The SPIRE agent re-attests automatically on restart. Transparent to workloads. |
| AWS IID | No | The SPIRE agent re-attests on its next startup once the old record is removed. |
| Azure IMDS / MSI | No | Same as AWS IID. |
| GCP IIT | No | Same as AWS IID. |

For re-attestable nodes, pruning is low-risk: a node that comes back online simply re-attests without any manual intervention.

For non-reattestable nodes, also known as Trust On First Use (TOFU) nodes, the behaviour is different. TOFU semantics exist to prevent identity reuse: once a node has attested with a given piece of evidence, no other node can use the same evidence to claim the same identity. See the SPIRE AWS IID security considerations for background on why cloud provider attestors use this restriction. The TOFU restriction blocks re-attestation for any node whose record still exists in the datastore, even if that node’s certificate has expired. A TOFU node that comes back online while its record is still present therefore cannot re-attest, and cannot rejoin the trust zone until the record is pruned. Once the record is removed, the same node can attest afresh when its agent next starts.
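The re-attestation rule described above can be summarised as a small decision function. This is an illustrative sketch only, not Cofide SPIRE code: the function name `can_attest` and its two boolean inputs are hypothetical, chosen to mirror the wording of this page.

```python
def can_attest(reattestable: bool, record_exists: bool) -> bool:
    """Model of whether a node presenting its usual evidence may attest.

    Hypothetical helper for illustration; the real check lives inside
    the SPIRE server and its node attestor plugins.
    """
    if reattestable:
        # Kubernetes PSAT and TPM nodes may re-attest at any time.
        return True
    # TOFU nodes (AWS IID, Azure, GCP) are blocked while a record for
    # them still exists, even if the certificate on it has expired.
    return not record_exists
```

Under this model, a TOFU node with a lingering record is the only case that is refused: `can_attest(False, True)` is `False`, while the other three combinations of inputs return `True`.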

Non-reattestable nodes are excluded from pruning by default. Set pruneTOFUNodes: true to include them.
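How the two Helm values combine can be sketched as a single predicate. The code below is a simplified model under stated assumptions, not the actual pruning job: the `AttestedNode` class, its field names, and `is_prunable` are hypothetical, and `expired_for=None` stands in for leaving pruneAttestedNodesExpiredFor empty.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AttestedNode:
    """Hypothetical, simplified view of an attested node record."""
    spiffe_id: str
    cert_not_after: datetime  # expiry time of the node's certificate
    reattestable: bool        # False for TOFU attestors (AWS, Azure, GCP)


def is_prunable(
    node: AttestedNode,
    now: datetime,
    expired_for: Optional[timedelta],  # models pruneAttestedNodesExpiredFor
    prune_tofu: bool,                  # models pruneTOFUNodes
) -> bool:
    """Return True if this model would delete the record at time `now`."""
    if expired_for is None:
        # Pruning is disabled when no grace period is configured.
        return False
    if not node.reattestable and not prune_tofu:
        # TOFU nodes are excluded unless explicitly included.
        return False
    # The certificate must have been expired for longer than the grace period.
    return now - node.cert_not_after > expired_for
```

For example, with a 24-hour grace period and a certificate that expired 30 hours ago, a Kubernetes PSAT node is prunable regardless of `prune_tofu`, whereas an AWS IID node is prunable only when `prune_tofu` is `True`.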

Trust domains using only Kubernetes PSAT and/or TPM

spire-server:
  pruneAttestedNodesExpiredFor: "24h"
  pruneTOFUNodes: false

Re-attestation is automatic and seamless for both attestors, so a short grace period is sufficient. A decommissioned node’s record is cleaned up within roughly 25 hours of the node disappearing.

Trust domains that include AWS IID, Azure, or GCP node attestors

spire-server:
  pruneAttestedNodesExpiredFor: "168h"
  pruneTOFUNodes: true

pruneTOFUNodes is false by default because enabling it changes the recovery behaviour for TOFU nodes in a way that requires deliberate consideration. Set it to true if you want records for AWS, Azure, and GCP nodes to be pruned; without it, those records accumulate indefinitely regardless of the grace period.

For TOFU nodes, the grace period should be set long enough to give operators time to identify and act on a node that has gone offline, rather than allowing it to re-attest automatically. Because a TOFU node whose record still exists cannot re-attest, a long grace period preserves the TOFU safety guarantee: a node cannot silently rejoin the trust zone after an extended outage without operator awareness. A grace period of "168h" (7 days) gives operators a reasonable window to investigate and take action before a decommissioned node’s record is cleaned up.

If a node needs to rejoin the trust zone before the grace period expires, an operator can manually delete its record using spire-server agent evict, which allows the node to re-attest immediately on its next startup.

The effective time between a node going offline and its record being cleaned up is approximately the node SVID TTL plus pruneAttestedNodesExpiredFor. With the default Cofide SPIRE node TTL of 1 hour, a 24-hour grace period means a decommissioned node’s record persists for approximately 25 hours after it disappears.
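The arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch: `record_lifetime` is a hypothetical helper, and it uses the approximation stated on this page (node SVID TTL plus grace period). Because the pruning job runs roughly once an hour, the actual deletion may plausibly lag this figure by up to about an hour.

```python
from datetime import timedelta


def record_lifetime(node_svid_ttl: timedelta, expired_for: timedelta) -> timedelta:
    """Approximate time a record persists after its node disappears."""
    return node_svid_ttl + expired_for


# Default Cofide SPIRE node TTL, as stated on this page.
ttl = timedelta(hours=1)

# Grace periods from the two example configurations, printed in hours.
print(record_lifetime(ttl, timedelta(hours=24)).total_seconds() / 3600)   # 25.0
print(record_lifetime(ttl, timedelta(hours=168)).total_seconds() / 3600)  # 169.0
```

With the 7-day grace period recommended for TOFU nodes, a decommissioned node's record therefore persists for roughly 169 hours.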

  • See Attestation for an overview of the node attestors supported by Cofide SPIRE.