How do you guarantee idempotency and exactly-once writes?

Design RESTful idempotency with keys, ETags, retries, and a transactional outbox to survive partial failures.

Answer

Guarantee idempotency by requiring a client-supplied idempotency key on every write, storing the request fingerprint and final result, and returning the same response on retry. Add conditional requests using ETags with If-Match to prevent lost updates. Achieve practical exactly-once semantics with a transactional outbox and consumer deduplication rather than trusting the transport. Classify errors, use exponential backoff with jitter, and route poison requests to a dead-letter path. Persist operation status and correlate all retries for auditability.

Long Answer

Designing RESTful API writes for real reliability means assuming networks drop, clients retry, and downstream systems flap. The objective is twofold: (1) strong idempotency so any safe retry produces one logical effect, and (2) practical exactly-once semantics end to end, despite partial failures and at-least-once delivery. The strategy blends request deduplication, optimistic concurrency, transactional messaging, and disciplined retries.

1) Idempotency model and surface
Make idempotency explicit. For mutating endpoints (POST /payments, POST /orders), require a caller-supplied Idempotency-Key that uniquely identifies the logical operation. Persist the key with a request hash (method, path, normalized body) and the eventual result (status, response body, error code). On a retry with the same key, verify the fingerprint and return the previously stored outcome. If fingerprints differ, reject with a 409 Conflict to prevent key reuse for a different intent.
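The replay-or-reject flow described above can be sketched roughly as follows. This is a minimal illustration, not a production handler: the in-memory `_results` dict, the `create_payment` function, and the response shape are all hypothetical stand-ins for a durable store and a real endpoint.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class StoredResult:
    fingerprint: str
    status: int
    body: dict


# Hypothetical in-memory store; a real service would use a durable database.
_results: dict[str, StoredResult] = {}
_next_payment_id = 0


def fingerprint(method: str, path: str, body: dict) -> str:
    """Hash the method, path, and a normalized (sorted-key) JSON body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{method}\n{path}\n{canonical}".encode()).hexdigest()


def create_payment(idempotency_key: str, body: dict) -> tuple[int, dict]:
    """Handle POST /payments: replay a stored outcome, reject key reuse, or execute once."""
    global _next_payment_id
    fp = fingerprint("POST", "/payments", body)
    stored = _results.get(idempotency_key)
    if stored is not None:
        if stored.fingerprint != fp:
            # Same key, different intent: refuse rather than guess.
            return 409, {"error": "idempotency key reused with a different request"}
        return stored.status, stored.body  # replay the original outcome
    _next_payment_id += 1
    response = {"payment_id": _next_payment_id, "amount": body["amount"]}
    _results[idempotency_key] = StoredResult(fp, 201, response)
    return 201, response
```

A retry with the same key and an equivalent body (even with keys in a different order) gets the stored 201 back, while the same key with a different amount gets 409.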

2) Storage and lifecycle of keys
Store keys in a durable, low-latency store with a sensible retention policy (for example, 24–72 hours or aligned to business reconciliation windows). Use a unique constraint on (idempotency_key, tenant) to eliminate races. Include a small state machine: pending → succeeded or failed_terminal. While pending, concurrent retries should block or poll until resolution, avoiding duplicate work.
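One way to express the unique constraint, the small state machine, and retention is sketched below using SQLite from the Python standard library. The table layout and the helper names (`claim`, `finish`, `purge_expired`) are assumptions made for illustration; any relational store with unique constraints would serve.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE idempotency_keys (
            tenant          TEXT NOT NULL,
            idempotency_key TEXT NOT NULL,
            status          TEXT NOT NULL
                CHECK (status IN ('pending', 'succeeded', 'failed_terminal')),
            response        TEXT,
            created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            UNIQUE (tenant, idempotency_key)
        )
        """
    )
    return conn


def claim(conn: sqlite3.Connection, tenant: str, key: str) -> bool:
    """Return True if this caller inserted the pending row and therefore owns the work."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO idempotency_keys (tenant, idempotency_key, status) "
                "VALUES (?, ?, 'pending')",
                (tenant, key),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # another request already holds this key: wait or poll


def finish(conn: sqlite3.Connection, tenant: str, key: str, status: str, response) -> bool:
    """Move pending -> succeeded or failed_terminal; rows already terminal are left alone."""
    with conn:
        cur = conn.execute(
            "UPDATE idempotency_keys SET status = ?, response = ? "
            "WHERE tenant = ? AND idempotency_key = ? AND status = 'pending'",
            (status, response, tenant, key),
        )
    return cur.rowcount == 1


def purge_expired(conn: sqlite3.Connection, max_age_hours: int) -> int:
    """Delete keys older than the retention window; returns the number removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM idempotency_keys WHERE created_at < datetime('now', ?)",
            (f"-{max_age_hours} hours",),
        )
    return cur.rowcount
```

The database, not application code, decides which of two racing requests wins the insert, which is what makes the claim race-free.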

3) Conditional requests with ETags
For updates on existing resources (PUT, PATCH, DELETE), enforce optimistic concurrency with ETags. The client reads the resource with its current ETag, then writes with If-Match: <etag>. The server verifies the precondition and either applies the change or returns 412 Precondition Failed. This prevents lost updates and complements idempotency: once a PUT has been applied the ETag changes, so a retry carrying the old ETag receives 412 instead of being applied a second time, and the client can re-read the resource to confirm that its change landed. Conflicting edits are surfaced deterministically in the same way.

4) Practical exactly-once via outbox and deduplication
End-to-end exactly-once semantics require control of both local writes and downstream effects. Use the transactional outbox pattern: in the same database transaction that persists the authoritative state change, insert an “outbox” event row describing the side effect (for example, “payment_authorized”). A reliable relay process reads the outbox, publishes each event to the message bus, and marks it as sent only after the publish is acknowledged. Because a crash between publishing and marking causes the event to be published again, delivery is at-least-once. Downstream consumers therefore implement deduplication using an operation_id or event_id with a uniqueness constraint or a processed-key store. This pair yields an exactly-once effect over at-least-once transport.
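The sketch below illustrates the outbox, relay, and consumer deduplication together, using SQLite so that it stays self-contained. The table names, the `mark_sent` flag (used only to simulate a relay crash after publishing), and the shared database between producer and consumer are simplifying assumptions; in practice the consumer would keep its processed-events table in its own store.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE payments (payment_id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE outbox (
            event_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload    TEXT NOT NULL,
            sent       INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE processed_events (event_id INTEGER PRIMARY KEY);
        """
    )
    return conn


def authorize_payment(conn: sqlite3.Connection, payment_id: str) -> None:
    """Write the state change and its outbox event in one transaction."""
    with conn:  # both rows commit together or neither does
        conn.execute("INSERT INTO payments VALUES (?, 'authorized')", (payment_id,))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("payment_authorized", json.dumps({"payment_id": payment_id})),
        )


def relay_once(conn: sqlite3.Connection, publish, mark_sent: bool = True) -> int:
    """Publish unsent events in order; mark each as sent only after publishing."""
    rows = conn.execute(
        "SELECT event_id, event_type, payload FROM outbox WHERE sent = 0 ORDER BY event_id"
    ).fetchall()
    for event_id, event_type, payload in rows:
        publish(event_id, event_type, payload)
        if mark_sent:  # mark_sent=False simulates a crash right after publishing
            with conn:
                conn.execute("UPDATE outbox SET sent = 1 WHERE event_id = ?", (event_id,))
    return len(rows)


def consume(conn: sqlite3.Connection, event_id: int, effects: list) -> bool:
    """Apply the effect once; a duplicate event_id violates the primary key and is skipped."""
    try:
        with conn:
            conn.execute("INSERT INTO processed_events VALUES (?)", (event_id,))
            effects.append(event_id)
        return True
    except sqlite3.IntegrityError:
        return False
```

If the relay publishes an event and then crashes before marking it sent, the next run publishes it again, and the consumer's uniqueness check turns that duplicate into a no-op.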

5) Retry policy and failure classification
Retries are only safe when both the API and downstream are idempotent. Classify errors:

  • Transient (timeouts, 5xx, 429) → retry with exponential backoff and jitter, bound by a maximum elapsed time.
  • Permanent (validation errors, 4xx except 409/429) → do not retry; record terminal failure against the key.
  • Unknown (client timeout after request possibly processed) → client must retry with the same idempotency key to resolve state.

Return 202 Accepted for long-running operations and expose a GET /operations/{key} status endpoint to decouple client waits from server work.
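The classification and backoff rules can be sketched on the client side as below. The function names, the default base and cap values, and the choice of "full jitter" are illustrative assumptions. The `send` callable is assumed to reuse the same idempotency key on every attempt and to return `None` when the client times out. Treating 409 as a separate "conflict" class that is not retried automatically is also an assumption, since the text only exempts it from the permanent class.

```python
import random
import time


def classify(status) -> str:
    """Map an HTTP status (or None for a client-side timeout) to a retry class."""
    if status is None:
        return "unknown"      # outcome not known: retry with the same key
    if status == 429 or status >= 500:
        return "transient"
    if status == 409:
        return "conflict"     # assumed: surfaced to the caller, not auto-retried
    if 400 <= status < 500:
        return "permanent"
    return "success"


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 5.0, rng=random.random) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def send_with_retry(send, max_attempts: int = 5, max_elapsed: float = 30.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Retry transient and unknown outcomes, bounded by attempts and elapsed time."""
    start = clock()
    status = None
    for attempt in range(max_attempts):
        status = send()
        if classify(status) not in ("transient", "unknown"):
            return status
        delay = backoff_delay(attempt)
        if clock() - start + delay > max_elapsed:
            break  # the next wait would exceed the time budget
        sleep(delay)
    return status
```

Injecting `sleep`, `clock`, and `rng` keeps the policy testable without real waiting.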

6) Exactly-once within a single service
At the resource boundary, make writes idempotent with natural keys and conditional upserts. Prefer INSERT … ON CONFLICT DO NOTHING or UPSERT with a deterministic id (for example, payment_id derived from the idempotency key). For side-effectful calls (emails, webhooks), keep a send log keyed by (operation_id, channel) and short-circuit already-sent entries. This ensures that even if the handler runs twice, external effects occur once.
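As an illustration, the sketch below derives a deterministic payment_id from the tenant and idempotency key and keeps a send log keyed by (operation_id, channel). The namespace value, table layout, and function names are hypothetical. Note one design choice: the log entry is written before the send, which avoids duplicates but means a crash between the two steps can skip a send; writing the log afterwards makes the opposite trade.

```python
import sqlite3
import uuid

# Hypothetical namespace used only to derive stable ids in this sketch.
PAYMENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "payments.example.invalid")


def payment_id_for(tenant: str, idempotency_key: str) -> str:
    """The same (tenant, key) pair always yields the same payment_id."""
    return str(uuid.uuid5(PAYMENT_NAMESPACE, f"{tenant}:{idempotency_key}"))


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE payments (payment_id TEXT PRIMARY KEY, amount INTEGER NOT NULL);
        CREATE TABLE send_log (
            operation_id TEXT NOT NULL,
            channel      TEXT NOT NULL,
            PRIMARY KEY (operation_id, channel)
        );
        """
    )
    return conn


def record_payment(conn: sqlite3.Connection, tenant: str, key: str, amount: int) -> str:
    """Insert the payment once; a second run with the same key changes nothing."""
    payment_id = payment_id_for(tenant, key)
    with conn:
        conn.execute(
            "INSERT INTO payments (payment_id, amount) VALUES (?, ?) "
            "ON CONFLICT (payment_id) DO NOTHING",
            (payment_id, amount),
        )
    return payment_id


def send_once(conn: sqlite3.Connection, operation_id: str, channel: str, send) -> bool:
    """Call send() only if this (operation_id, channel) pair has not been logged yet."""
    with conn:
        cur = conn.execute(
            "INSERT INTO send_log (operation_id, channel) VALUES (?, ?) "
            "ON CONFLICT DO NOTHING",
            (operation_id, channel),
        )
    if cur.rowcount == 0:
        return False  # already logged: short-circuit
    send()
    return True
```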

7) Concurrency control and race safety
For hot rows, combine ETags with short-lived row locks or compare-and-swap semantics. Access shared resources in a consistent order to reduce deadlocks. For high contention counters, use database sequences or atomic increments with idempotent reconciliation (for example, append-only ledger with periodic compaction) rather than in-place mutable totals.
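Two of these ideas are sketched below under simple assumptions: a compare-and-swap update guarded by a version column, and an append-only ledger whose entries are keyed by operation_id so that a replayed write cannot be counted twice. The schema and function names are hypothetical, and periodic compaction is left out.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (
            account_id TEXT PRIMARY KEY,
            version    INTEGER NOT NULL,
            owner      TEXT NOT NULL
        );
        CREATE TABLE ledger (
            operation_id TEXT PRIMARY KEY,
            account_id   TEXT NOT NULL,
            delta        INTEGER NOT NULL
        );
        INSERT INTO accounts VALUES ('acct-1', 1, 'alice');
        """
    )
    return conn


def rename_owner(conn: sqlite3.Connection, account_id: str,
                 expected_version: int, owner: str) -> bool:
    """Compare-and-swap: the update applies only if the version is unchanged."""
    with conn:
        cur = conn.execute(
            "UPDATE accounts SET owner = ?, version = version + 1 "
            "WHERE account_id = ? AND version = ?",
            (owner, account_id, expected_version),
        )
    return cur.rowcount == 1


def append_entry(conn: sqlite3.Connection, operation_id: str,
                 account_id: str, delta: int) -> None:
    """Append a ledger entry; replaying the same operation_id is a no-op."""
    with conn:
        conn.execute(
            "INSERT INTO ledger VALUES (?, ?, ?) ON CONFLICT (operation_id) DO NOTHING",
            (operation_id, account_id, delta),
        )


def balance(conn: sqlite3.Connection, account_id: str) -> int:
    """Derive the total from the ledger instead of mutating a counter in place."""
    row = conn.execute(
        "SELECT COALESCE(SUM(delta), 0) FROM ledger WHERE account_id = ?",
        (account_id,),
    ).fetchone()
    return row[0]
```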

8) Schema, hashing, and fingerprinting
Normalize the request before hashing: use canonical JSON (sorted keys, no insignificant whitespace), format numbers consistently, and exclude benign fields that do not change intent (for example, the idempotency key itself). Persist the canonical body alongside the hash for audits. Include tenant, user, and scope in the fingerprint to prevent cross-tenant key collisions. If untrusted clients generate the keys, sign or HMAC each key on the server side before storing it.
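A minimal sketch of canonicalization and key scoping follows. The ignored-field set, the function names, and the newline-joined hashing layout are assumptions. Number formatting is not normalized here: `json.dumps` renders 100 and 100.0 differently, so amounts would need to be normalized (for example, to integer minor units) before this step.

```python
import hashlib
import hmac
import json

# Assumption: fields that do not change the caller's intent.
IGNORED_FIELDS = {"idempotency_key"}


def canonical_body(body: dict) -> str:
    """Sorted keys at every nesting level, no insignificant whitespace."""
    kept = {k: v for k, v in body.items() if k not in IGNORED_FIELDS}
    return json.dumps(kept, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def request_fingerprint(tenant: str, method: str, path: str, body: dict) -> str:
    """Hash tenant, method, path, and canonical body into one fingerprint."""
    material = "\n".join([tenant, method.upper(), path, canonical_body(body)])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def scoped_key(server_secret: bytes, tenant: str, client_key: str) -> str:
    """HMAC the client-supplied key with the tenant so stored keys are tenant-bound."""
    message = f"{tenant}:{client_key}".encode("utf-8")
    return hmac.new(server_secret, message, hashlib.sha256).hexdigest()
```

Two requests that differ only in key order, method casing, or the presence of the ignored field produce the same fingerprint, while the same body under another tenant does not.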

9) Observability and governance
Emit structured logs and metrics: counts of idempotent hits vs new operations, conflicts, ETag precondition failures, outbox backlog, and consumer deduplications. Attach operation_id, idempotency_key, and resource identifiers to traces. Alert on abnormal pending durations and outbox growth. Provide an operator console to inspect an operation, replay an outbox event, or mark a poisoned payload as permanently failed with rationale.
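As a small illustration of the signals listed above, the sketch below counts events, emits one structured log line per event, and flags keys that have been pending too long. The event names, the in-process `Counter`, and the `stale_pending` helper are hypothetical; a real deployment would send these to its own metrics and logging stack.

```python
import json
from collections import Counter

# Hypothetical in-process metrics registry.
metrics = Counter()


def record(event: str, **fields) -> str:
    """Increment the counter for this event and return a structured log line."""
    metrics[event] += 1
    return json.dumps({"event": event, **fields}, sort_keys=True)


def stale_pending(pending_started_at: dict, now: float, max_age_seconds: float) -> list:
    """Return keys that have been pending longer than the allowed duration."""
    return sorted(
        key for key, started in pending_started_at.items()
        if now - started > max_age_seconds
    )
```

For example, `record("idempotent_replay", operation_id="op-1", idempotency_key="k1")` increments the replay counter and yields a JSON line that carries both identifiers for correlation.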

10) Security and multi-tenant boundaries
Associate keys with tenant identity and scopes; never allow a key to be reused across tenants. Redact sensitive fields in logs and preserve only the minimal canonical body required for reconciliation. Keys should convey no secrets; treat them as identifiers, not as authorizations.

By blending idempotency keys, ETag-based conditional requests, transactional outbox with consumer dedupe, and disciplined retries, you convert unreliable networks and partial failures into predictable, safe write semantics. The system delivers one logical result per intent, is observable, and is operable under stress.

Table

| Aspect | Practice | Implementation | Outcome |
| --- | --- | --- | --- |
| Idempotency | Client keys + fingerprint | Idempotency-Key stored with canonical body and result | Safe retries, consistent responses |
| Concurrency | Conditional requests | ETags with If-Match, 412 on mismatch | No lost updates |
| Exactly-once | Transactional outbox + dedupe | Commit state + outbox in one TX; consumers de-dup by operation_id | One logical effect end to end |
| Retries | Backoff with jitter | Classify transient vs permanent; cap attempts and elapsed time | No retry storms |
| Storage | Unique constraints | (tenant, idempotency_key) unique; upserts by natural id | Race-free writes |
| Observability | Metrics + tracing | Operation status, outbox backlog, conflict rate, DLQ size | Fast triage and audits |
| Long-running | Operation endpoint | 202 with GET /operations/{key} | Decoupled latency, fewer timeouts |

Common Mistakes

  • Requiring retries but not supporting Idempotency-Key, causing double charges.
  • Accepting keys but not persisting request fingerprints, allowing silent key reuse for different intents.
  • Relying on at-least-once queues without a transactional outbox, so messages are lost or duplicated on crash.
  • Skipping ETags, leading to lost updates in concurrent edits.
  • Retrying all 4xx errors and flooding partners with bad payloads.
  • Using long-lived locks instead of optimistic control, creating deadlocks.
  • Forgetting deduplication in consumers, so emails or webhooks fire multiple times.
  • Omitting structured observability, leaving operators blind to pending keys, outbox lag, and conflict spikes.

Sample Answers (Junior / Mid / Senior)

Junior:
“I require an Idempotency-Key on create operations and store the key with the response. If the client retries, I return the same result. For updates I use ETags with If-Match to prevent lost updates. I retry only on timeouts and 5xx with backoff.”

Mid:
“I persist a canonical request hash next to the Idempotency-Key and reject mismatched retries with 409. I implement a transactional outbox so that state changes and emitted events are atomic. Consumers de-dup by operation_id to achieve practical exactly-once semantics. I expose GET /operations/{key} for long tasks.”

Senior:
“I design end to end: keys and fingerprints at the API, ETag preconditions for updates, upserts by natural identifiers, and an outbox relay with idempotent consumers. Retry policy is classified and jittered. Observability tracks conflict rates, outbox backlog, and DLQ volume. Tenant scoping prevents cross-tenant key reuse, and audits include canonical bodies for reconciliation.”

Evaluation Criteria

Strong answers include client-supplied Idempotency-Key, stored fingerprint, and consistent replay; ETag conditional writes to avoid lost updates; transactional outbox with consumer dedupe for practical exactly-once semantics; and disciplined retries with exponential backoff and jitter. They mention unique constraints, canonicalization, a status endpoint for long-running operations, and observable metrics. Weak answers speak only about “using transactions” or “catching duplicates” without keys, outbox, or consumer dedupe. Red flags: retrying validation errors, ignoring concurrency control, or relying solely on the message bus for exactly-once guarantees.

Preparation Tips

Build a small API with POST /payments that requires Idempotency-Key. Persist (tenant, key, canonical_body_hash, status, response). Add a GET /operations/{key} endpoint. Implement updates with ETags and If-Match. Add a transactional outbox table and a relay that publishes events; make the consumer de-duplicate by operation_id. Classify retryable errors, add exponential backoff with jitter, and cap total elapsed time. Emit metrics for new vs replayed keys, 412 preconditions, outbox lag, and DLQ size. Write tests for duplicate requests, conflicting ETags, crash between commit and publish, and consumer replay. Document retention policy and tenant scoping.

Real-world Context

A payments API once double-charged when clients retried on timeouts. Introducing Idempotency-Key with canonical hashing and a send log eliminated duplicates. A marketplace lost edits due to concurrent PATCH calls; adding ETag If-Match stopped lost updates and clarified conflicts. A logistics platform dropped webhook events during deploys; a transactional outbox with idempotent consumers delivered one logical shipment creation despite restarts. After adding jittered retries, DLQs, and an operation status endpoint, support tickets fell, and operators could replay safely with audit trails.

Key Takeaways

  • Require Idempotency-Key and store a canonical request fingerprint and result.
  • Use ETag conditional requests to prevent lost updates and race conditions.
  • Achieve practical exactly-once semantics with a transactional outbox and consumer deduplication.
  • Classify errors and apply jittered backoff with caps; never retry validation failures.
  • Instrument operations, expose status endpoints, and enforce tenant-scoped uniqueness.

Practice Exercise

Scenario:
You own POST /orders and PATCH /orders/{id} in a high-traffic RESTful API. Mobile clients retry aggressively on poor networks. Downstream webhooks must fire exactly once per order. Incidents include double order creation, lost updates, and missing webhooks after restarts.

Tasks:

  1. Implement Idempotency-Key on POST /orders. Persist (tenant, key, canonical_body_hash, status, response, created_at) with a unique index. On retry with the same key and identical hash, replay the stored response; on hash mismatch, return 409 Conflict.
  2. Add ETag support to GET /orders/{id} and enforce If-Match on PATCH and DELETE. Return 412 on mismatch.
  3. Introduce a transactional outbox: in the same transaction that creates or updates the order, insert an outbox row for “order_created” or “order_updated.” Build a relay to publish events with operation_id and mark sent atomically.
  4. Make the webhook consumer idempotent by deduplicating on operation_id with a unique constraint or processed-keys store.
  5. Implement retries with exponential backoff and jitter for 5xx, timeouts, and 429; do not retry 4xx validation errors.
  6. Expose GET /operations/{key} returning pending, succeeded, or failed_terminal plus last response.
  7. Emit metrics for idempotent replays, ETag precondition failures, outbox backlog, and consumer deduplications.
  8. Write tests for duplicate client sends, race between commit and publish, consumer restart replay, and conflicting edits.

Deliverable:
An end-to-end design that proves idempotency, conditional requests, transactional outbox, and exactly-once semantics under retries and partial failures, with observability and safe replay.
