Rate limits and send speed management
The WhatsApp Business API enforces a default rate limit of 80 messages per second per phone number. High-volume accounts with a solid quality history can request a higher limit. Requests that exceed the limit are rejected with 429 errors, which your system must handle with exponential backoff.
For systems that need to send thousands of messages in a short time, the correct architecture involves a message queue (RabbitMQ, Redis Queue, AWS SQS) that regulates flow toward the API. The producer adds messages to the queue at any speed; the consumer extracts and sends them respecting the rate limit.
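As a rough illustration of the consumer side, the sketch below drains an in-process queue and spaces sends evenly so that throughput stays at or below a configured rate. It is a minimal sketch, not a production design: `PacedConsumer`, its parameters, and the `send` callable are hypothetical names, and Python's `queue.Queue` stands in for a real broker such as RabbitMQ or SQS.

```python
import queue
import time
from typing import Any, Callable


class PacedConsumer:
    """Drains a queue, spacing sends at least 1/max_per_second apart."""

    def __init__(
        self,
        source: "queue.Queue[Any]",
        send: Callable[[Any], None],      # hypothetical function that calls the API
        max_per_second: float = 80.0,     # default limit quoted in the text
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.source = source
        self.send = send
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self._next_slot = 0.0  # earliest time the next send is allowed

    def drain(self) -> int:
        """Send every queued message, then return how many were sent."""
        sent = 0
        while True:
            try:
                message = self.source.get_nowait()
            except queue.Empty:
                return sent
            now = self.clock()
            if now < self._next_slot:
                # Too early for the next send: wait until the slot opens.
                self.sleep(self._next_slot - now)
            self.send(message)
            # Schedule the following slot one interval after this one.
            self._next_slot = max(self._next_slot, now) + self.interval
            sent += 1
```

The producer can call `source.put(...)` as fast as it likes; only `drain` touches the API, so the rate limit is enforced in one place. The clock and sleep functions are injectable so the pacing can be tested without real delays.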
Error handling and retry logic
WhatsApp message sending errors fall into two categories: transient errors (network failure, timeout, rate limit) and permanent errors (number not registered on WhatsApp, message rejected for a policy violation). Only transient errors should be retried.
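One way to encode this distinction is a small classifier keyed on the HTTP outcome of the send. The sketch below is illustrative only: `classify_failure` and `ErrorKind` are hypothetical names, and treating 5xx responses as transient is an assumption that goes beyond the categories listed above. Real integrations would also inspect the API's own error codes.

```python
from enum import Enum
from typing import Optional


class ErrorKind(Enum):
    TRANSIENT = "transient"   # safe to retry
    PERMANENT = "permanent"   # retrying will not help


def classify_failure(status_code: Optional[int]) -> ErrorKind:
    """Classify a failed send by its HTTP status.

    status_code is None when no response arrived at all
    (network failure or timeout).
    """
    if status_code is None:
        return ErrorKind.TRANSIENT
    if status_code == 429:
        # Rate limit hit: retry later.
        return ErrorKind.TRANSIENT
    if 500 <= status_code <= 599:
        # Assumption: server-side errors are temporary.
        return ErrorKind.TRANSIENT
    # Everything else (for example a rejected message) is not retried.
    return ErrorKind.PERMANENT
```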
Implement retry logic with exponential backoff: make the first retry after 1 second, the second after 2 seconds, and the third after 4 seconds. If a transient error persists after the third retry, move the message to a dead letter queue for manual analysis.
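The retry schedule can be sketched as follows, assuming one initial attempt followed by up to three retries. `send_with_retry`, `TransientError`, and `PermanentError` are hypothetical names introduced for illustration, and a plain list stands in for the dead letter queue.

```python
import time
from typing import Any, Callable, List, Sequence


class TransientError(Exception):
    """Network failure, timeout, or rate limit: worth retrying."""


class PermanentError(Exception):
    """Unregistered number or policy rejection: never retried."""


def send_with_retry(
    send: Callable[[Any], None],          # hypothetical function that calls the API
    message: Any,
    dead_letters: List[Any],              # stand-in for a real dead letter queue
    delays: Sequence[float] = (1.0, 2.0, 4.0),
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once the message is sent, False if it was dead-lettered.

    PermanentError is deliberately not caught, so it reaches the caller
    without any retry.
    """
    for attempt in range(len(delays) + 1):
        try:
            send(message)
            return True
        except TransientError:
            if attempt < len(delays):
                # Wait 1 s, then 2 s, then 4 s before the next attempt.
                sleep(delays[attempt])
    # The last retry also failed: park the message for manual analysis.
    dead_letters.append(message)
    return False
```

In practice a small random jitter is often added to each delay so that many failing senders do not retry in lockstep; it is omitted here to match the schedule described above.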
Message idempotency
Assign a unique ID to each message before sending and save it in the database. If a retry would send the same message a second time, the unique ID lets you detect the duplicate and skip it, so the recipient never receives the same message twice.
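A minimal sketch of this check, using SQLite from the standard library as a stand-in for the production database. The `outbox` table, its columns, and the `open_store` and `send_once` helpers are hypothetical names chosen for this example.

```python
import sqlite3
import uuid
from typing import Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (hypothetical) outbox table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "message_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return conn


def send_once(
    conn: sqlite3.Connection,
    message_id: str,
    send: Callable[[str], None],   # hypothetical function that calls the API
) -> bool:
    """Send the message unless its ID is already recorded as sent."""
    row = conn.execute(
        "SELECT status FROM outbox WHERE message_id = ?", (message_id,)
    ).fetchone()
    if row is not None and row[0] == "sent":
        return False  # duplicate: skip it
    # Record the ID before sending, as the text recommends.
    conn.execute(
        "INSERT OR IGNORE INTO outbox (message_id, status) VALUES (?, 'pending')",
        (message_id,),
    )
    conn.commit()
    send(message_id)
    conn.execute(
        "UPDATE outbox SET status = 'sent' WHERE message_id = ?", (message_id,)
    )
    conn.commit()
    return True


# Usage: generate the ID once, then reuse it for every retry.
new_message_id = str(uuid.uuid4())
```

This sketch does not cover a crash between a successful send and the status update; closing that gap requires additional measures that are beyond this example.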
Monitoring and observability
A production WhatsApp system requires real-time operational metrics: messages sent per minute, delivery rate, error rate by type, and average latency from send to the 'delivered' webhook event.
Configure automatic alerts on critical thresholds: an error rate above 2%, an average latency above 5 seconds, or a message queue that keeps growing without being drained. These alerts enable proactive intervention before problems become visible to end customers.
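The threshold checks can be expressed as a small function over a metrics snapshot. This is a sketch under stated assumptions: `Snapshot` and `active_alerts` are hypothetical names, and "growing without being drained" is approximated here as a queue depth that rises in every recent sample, which is only one possible heuristic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Snapshot:
    sent: int                 # messages sent in the window
    failed: int               # messages that failed in the window
    avg_latency_s: float      # average send-to-delivered latency, in seconds
    queue_depths: List[int]   # recent queue depth samples, oldest first


def active_alerts(
    snapshot: Snapshot,
    max_error_rate: float = 0.02,   # 2% threshold from the text
    max_latency_s: float = 5.0,     # 5 second threshold from the text
) -> List[str]:
    """Return the names of all alerts that should fire for this snapshot."""
    alerts = []
    error_rate = snapshot.failed / snapshot.sent if snapshot.sent else 0.0
    if error_rate > max_error_rate:
        alerts.append("error_rate")
    if snapshot.avg_latency_s > max_latency_s:
        alerts.append("latency")
    depths = snapshot.queue_depths
    # Assumption: strictly increasing depth means the queue is not draining.
    if len(depths) >= 2 and all(b > a for a, b in zip(depths, depths[1:])):
        alerts.append("queue_growing")
    return alerts
```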
Chat API exposes status events via webhook for every sent message (sent, delivered, read, failed). Consuming these events and storing them in a database lets you calculate service quality metrics in real time and feed them to monitoring dashboards.
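To show how status events can turn into metrics, the sketch below aggregates them in memory. The event shape (a dict with `message_id`, `status`, and `timestamp` in seconds) is a simplified assumption, not the actual webhook payload, and the `StatusTracker` class is a hypothetical stand-in for the database layer.

```python
from collections import defaultdict
from typing import Dict, Optional


class StatusTracker:
    """Keeps the first timestamp seen for each status of each message."""

    def __init__(self) -> None:
        # message_id -> {status: timestamp}
        self._times: Dict[str, Dict[str, float]] = defaultdict(dict)

    def record(self, event: dict) -> None:
        """Store one status event; repeated deliveries of it are ignored."""
        statuses = self._times[event["message_id"]]
        statuses.setdefault(event["status"], float(event["timestamp"]))

    def delivery_rate(self) -> float:
        """Fraction of sent messages that also reached 'delivered'."""
        sent = [s for s in self._times.values() if "sent" in s]
        if not sent:
            return 0.0
        delivered = sum(1 for s in sent if "delivered" in s)
        return delivered / len(sent)

    def average_delivery_latency(self) -> Optional[float]:
        """Mean seconds from 'sent' to 'delivered', or None without data."""
        latencies = [
            s["delivered"] - s["sent"]
            for s in self._times.values()
            if "sent" in s and "delivered" in s
        ]
        if not latencies:
            return None
        return sum(latencies) / len(latencies)
```

A webhook handler would call `record` for each incoming event, and the dashboard would poll the two metric methods. Keeping only the first timestamp per status makes the tracker tolerant of webhook events that are delivered more than once.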