Dead Letter Queue
A dead letter queue (DLQ) is a holding queue that receives messages that cannot be successfully processed after exhausting all configured retry attempts, preventing bad messages from blocking normal queue operations.
Every production messaging system needs a safety valve for messages that fail repeatedly. Without a DLQ, a "poison message" (one that consistently causes consumer errors, perhaps due to malformed data or a schema mismatch) will retry indefinitely, consuming resources and potentially starving the healthy messages queued behind it. The DLQ pattern moves these messages out of the hot path while preserving them for diagnosis and potential replay.
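To make the pattern concrete, here is a minimal in-memory sketch of a consumer loop with a retry budget. It is an illustration only: the names drain, parse_int, and MAX_DELIVERIES are hypothetical, the queues are plain Python deques rather than a real broker, and the limit of three deliveries is an assumed value.

```python
from collections import deque

MAX_DELIVERIES = 3  # assumed retry budget; real brokers make this configurable


def drain(queue, dlq, handler, max_deliveries=MAX_DELIVERIES):
    """Process every message, dead-lettering those that keep failing."""
    processed = []
    while queue:
        body, deliveries = queue.popleft()
        try:
            handler(body)
            processed.append(body)
        except Exception as exc:
            deliveries += 1
            if deliveries >= max_deliveries:
                # Retry budget exhausted: park the message instead of retrying forever.
                dlq.append((body, deliveries, repr(exc)))
            else:
                # Requeue at the back so healthy messages are not blocked.
                queue.append((body, deliveries))
    return processed


def parse_int(body):
    """Hypothetical handler: fails on malformed input (a poison message)."""
    return int(body)


work = deque([("1", 0), ("not-a-number", 0), ("2", 0)])
dead = deque()
done = drain(work, dead, parse_int)
```

In this sketch the two valid messages are processed, while the malformed one is attempted three times and then parked in dead together with its delivery count and the last error.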
Messages arrive in a DLQ for several reasons: they exceed the maximum delivery count (the most common case, covered in detail in Message Queue Retry); they exceed the queue's message TTL without being consumed; or the queue has reached its configured length limit and must drop messages to stay within it (in RabbitMQ, such messages are routed to the exchange named by the queue's x-dead-letter-exchange argument).
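As a rough illustration of the TTL and capacity triggers, the sketch below builds the queue arguments RabbitMQ uses for them. The x-* keys are standard RabbitMQ queue arguments; the helper name work_queue_arguments, the exchange name "dlx", and the numeric values are assumptions chosen for the example.

```python
def work_queue_arguments(dlx_name="dlx", ttl_ms=60_000, max_length=10_000):
    """Build queue arguments that enable dead-lettering in RabbitMQ."""
    return {
        # Exchange that receives messages dead-lettered from this queue.
        "x-dead-letter-exchange": dlx_name,
        # Messages older than this many milliseconds expire and are dead-lettered.
        "x-message-ttl": ttl_ms,
        # Once the queue holds this many messages, overflow is dropped
        # (oldest first by default) and dead-lettered.
        "x-max-length": max_length,
    }


args = work_queue_arguments()
# With a client such as pika, these would be passed when declaring the queue, e.g.
#   channel.queue_declare(queue="work", durable=True, arguments=args)
```

A separate queue bound to the dead-letter exchange then acts as the DLQ itself.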
Once in the DLQ, messages should trigger an alert to an on-call engineer. The message is preserved with its original payload plus metadata: the reason it was dead-lettered, the original queue it came from, the time of failure, and the last exception message. This metadata is invaluable for debugging.
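One way to carry this metadata is an envelope around the original payload. The sketch below is a hypothetical shape, not any broker's actual format: the DeadLetter fields, the dead_letter helper, and the alert callback (standing in for a paging or monitoring hook) are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DeadLetter:
    payload: bytes      # original message body, unchanged
    source_queue: str   # queue the message came from
    reason: str         # why it was dead-lettered
    last_error: str     # last exception seen by the consumer
    failed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def dead_letter(payload, source_queue, reason, exc, alert):
    """Wrap a failed message with diagnostic metadata and raise an alert."""
    record = DeadLetter(payload, source_queue, reason, f"{type(exc).__name__}: {exc}")
    alert(f"DLQ: message from {source_queue!r} dead-lettered ({reason})")
    return record


alerts = []  # stand-in for an on-call notification channel
try:
    int("not-a-number")
except ValueError as exc:
    record = dead_letter(
        b"not-a-number", "orders", "max_deliveries_exceeded", exc, alerts.append
    )
```

Keeping the payload byte-for-byte intact and putting diagnostics in separate fields means the message can later be replayed without any unwrapping logic beyond reading record.payload.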
After the root cause is fixed (a bug fix deployed, a downstream service recovered, or a schema migration applied), messages can be replayed from the DLQ back to the original queue. Cloud-managed services such as AWS SQS, Azure Service Bus, and GCP Pub/Sub differ in how much replay tooling they provide natively; SQS, for example, offers dead-letter queue redrive. Where no such tooling exists, as in most homegrown systems, a replay script reads from the DLQ and republishes to the source queue. Replayed messages should be processed by an Idempotent Consumer to handle any duplicates from prior partial processing.
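A homegrown replay, paired with a consumer that deduplicates by message ID, might look like the following sketch. Everything here is illustrative: the replay function, the IdempotentConsumer class, and the "id" and "body" message fields are assumed names, and the in-memory set stands in for a durable deduplication store.

```python
from collections import deque


class IdempotentConsumer:
    """Applies each message at most once, keyed by its ID."""

    def __init__(self):
        self.seen = set()   # a durable store in a real system
        self.applied = []

    def handle(self, message):
        msg_id = message["id"]
        if msg_id in self.seen:
            return False    # duplicate: side effect already applied
        self.applied.append(message["body"])
        self.seen.add(msg_id)
        return True


def replay(dlq, source_queue):
    """Move every message from the DLQ back to the source queue."""
    moved = 0
    while dlq:
        source_queue.append(dlq.popleft())
        moved += 1
    return moved


consumer = IdempotentConsumer()
# m1 was partially processed (its effect applied) before it was dead-lettered.
consumer.handle({"id": "m1", "body": "charge $5"})

dlq = deque([{"id": "m1", "body": "charge $5"}, {"id": "m2", "body": "charge $7"}])
source = deque()
moved = replay(dlq, source)
results = [consumer.handle(m) for m in source]
```

Here the replayed m1 is recognized as a duplicate and skipped, while m2 is applied for the first time, so each effect lands exactly once.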