Scalable WhatsApp Business API architecture for production

Moving a WhatsApp Business API integration from a working prototype to a stable production system requires deliberate architectural decisions. Rate limiting, asynchronous queues, error handling, and monitoring are the four pillars of a robust implementation.

Rate limits and send speed management

WhatsApp Business API enforces a default rate limit of 80 messages per second per phone number. This value can be increased on request for high-volume accounts with a solid quality history. Exceeding the rate limit generates 429 errors that your system must handle with exponential backoff.

For systems that need to send thousands of messages in a short time, the correct architecture involves a message queue (RabbitMQ, Redis Queue, AWS SQS) that regulates flow toward the API. The producer adds messages to the queue at any speed; the consumer extracts and sends them respecting the rate limit.
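As a minimal sketch of the consumer side, the Python below throttles an in-memory queue with a token bucket. The names TokenBucket and drain are illustrative, the deque stands in for RabbitMQ, Redis Queue, or AWS SQS, and send is a placeholder for whatever function actually calls the API. The clock and sleep parameters are only there so the loop can be tested without waiting.

```python
import time
from collections import deque


class TokenBucket:
    """Allows roughly `rate` sends per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self):
        # Refill tokens for the time elapsed since the last call, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def drain(queue, bucket, send, sleep=time.sleep):
    """Consumer loop: pop messages and send them no faster than the bucket allows."""
    while queue:
        if bucket.try_acquire():
            send(queue.popleft())
        else:
            sleep(1 / bucket.rate)  # wait about one token interval, then try again
```

With TokenBucket(rate=80, capacity=80), a producer can append to the deque at any speed while drain releases messages at about 80 per second, matching the default limit described above.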

Error handling and retry logic

WhatsApp message sending errors fall into two categories: transient errors (network failures, timeouts, rate limiting) and permanent errors (the number is not registered on WhatsApp, or the message was rejected for a policy violation). Only transient errors should be retried.
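One way to encode this split is a small classifier keyed on the HTTP status of the response. The mapping below is an assumption made for illustration, not the API's official error taxonomy: no response at all (network failure or timeout), 429, and 5xx are treated as transient, and every other 4xx as permanent. A real integration would refine this using the error codes in the response body.

```python
from enum import Enum


class ErrorKind(Enum):
    TRANSIENT = "transient"   # worth retrying
    PERMANENT = "permanent"   # retrying will not help


def classify(http_status):
    """Classify a failed send. `http_status` is None when no response arrived."""
    if http_status is None:
        return ErrorKind.TRANSIENT  # network failure or timeout
    if http_status == 429 or 500 <= http_status < 600:
        return ErrorKind.TRANSIENT  # rate limited, or a server-side problem
    return ErrorKind.PERMANENT      # assumed: remaining 4xx are not retryable
```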

Implement retry logic with exponential backoff: the first retry after 1 second, the second after 2 seconds, the third after 4 seconds. If the third retry also fails with a transient error, move the message to a dead letter queue for manual analysis.
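The schedule above can be sketched as follows. This reading assumes one initial attempt followed by up to three retries; TransientError, send_with_retry, and the list used as a dead letter queue are hypothetical names, and send is a placeholder for the real API call.

```python
import time

BACKOFF_SECONDS = (1, 2, 4)  # delay before the first, second, and third retry


class TransientError(Exception):
    """Raised by `send` for network errors, timeouts, and rate limiting."""


def send_with_retry(message, send, dead_letters, sleep=time.sleep):
    """Return True once sent; after the last failed retry, dead-letter and return False.

    Only TransientError is caught. Any other exception (for example a permanent
    error) propagates immediately, so it is never retried.
    """
    delays = iter(BACKOFF_SECONDS)
    while True:
        try:
            send(message)
            return True
        except TransientError as exc:
            delay = next(delays, None)
            if delay is None:
                # Backoff schedule exhausted: park the message for manual analysis.
                dead_letters.append((message, str(exc)))
                return False
            sleep(delay)
```

Passing sleep as a parameter keeps the function easy to test; in production the default time.sleep (or the delay feature of your queue) applies the real wait.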

Message idempotency

Associate a unique ID with each message before sending and save it in the database together with its send status. Before every send attempt, look up the ID: if the message is already recorded as sent, the retry is a duplicate and must be skipped. This prevents the recipient from receiving the same message multiple times.
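A minimal sketch of this check, assuming a single consumer and using SQLite as a stand-in for the production database. The outbox table and the open_store, enqueue, and send_once functions are hypothetical names chosen for the example.

```python
import sqlite3
import uuid


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS outbox ("
        "message_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    return db


def enqueue(db, message_id=None):
    """Register a message under a unique ID before any send attempt."""
    message_id = message_id or str(uuid.uuid4())
    # The primary key makes a repeated enqueue of the same ID a no-op.
    db.execute("INSERT OR IGNORE INTO outbox VALUES (?, 'pending')", (message_id,))
    db.commit()
    return message_id


def send_once(db, message_id, send):
    """Send only if the ID is known and not yet marked as sent. Returns True if sent now."""
    row = db.execute(
        "SELECT status FROM outbox WHERE message_id = ?", (message_id,)
    ).fetchone()
    if row is None or row[0] == "sent":
        return False  # unknown ID, or a duplicate of a message already sent
    send(message_id)
    db.execute("UPDATE outbox SET status = 'sent' WHERE message_id = ?", (message_id,))
    db.commit()
    return True
```

This sketch still has a small window between the send and the status update, and it does not coordinate several consumers; closing those gaps would need a claim step or a transactional update, which is beyond this example.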

Monitoring and observability

A production WhatsApp system requires real-time operational metrics: messages sent per minute, delivery rate, error rate by type, and average latency from send to the 'delivered' webhook event.

Configure automatic alerts on critical thresholds: an error rate above 2%, average latency above 5 seconds, or a message queue that keeps growing without being drained. These alerts enable proactive intervention before problems become visible to end customers.
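The three thresholds can be expressed as a simple check over a periodic metrics snapshot. The Snapshot fields and the definition of a growing queue (every depth sample higher than the previous one) are assumptions made for this sketch; in practice the same rules would usually live in your monitoring tool's alert configuration.

```python
from dataclasses import dataclass, field

MAX_ERROR_RATE = 0.02   # 2% of attempted sends
MAX_LATENCY_S = 5.0     # average send-to-delivered latency, in seconds


@dataclass
class Snapshot:
    attempted: int                 # sends attempted in the window
    failed: int                    # sends that ended in an error
    avg_latency_s: float           # average send-to-delivered latency
    queue_depths: list = field(default_factory=list)  # depth samples, oldest first


def check_alerts(s):
    """Return the names of the thresholds that the snapshot violates."""
    alerts = []
    if s.attempted and s.failed / s.attempted > MAX_ERROR_RATE:
        alerts.append("error_rate")
    if s.avg_latency_s > MAX_LATENCY_S:
        alerts.append("latency")
    depths = s.queue_depths
    # Assumed definition of "not being drained": depth rose at every sample.
    if len(depths) >= 2 and all(b > a for a, b in zip(depths, depths[1:])):
        alerts.append("queue_growing")
    return alerts
```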

Chat API exposes status events via webhook for every sent message (sent, delivered, read, failed). Consuming these events and persisting each status update in a database lets you calculate service quality metrics in real time and feed them to monitoring dashboards.
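To illustrate, the sketch below records status events and derives two of the metrics mentioned earlier. The event shape (message_id, status, timestamp in seconds) is a simplified assumption and not the actual webhook payload, and the in-memory dictionary stands in for the database.

```python
from collections import defaultdict


class StatusStore:
    """Keeps the first timestamp seen for each (message, status) pair."""

    def __init__(self):
        self.events = defaultdict(dict)  # message_id -> {status: timestamp}

    def record(self, event):
        # setdefault keeps the first timestamp, so a redelivered webhook is harmless.
        self.events[event["message_id"]].setdefault(event["status"], event["timestamp"])

    def delivery_rate(self):
        """Share of sent messages that have a 'delivered' event."""
        sent = [e for e in self.events.values() if "sent" in e]
        if not sent:
            return 0.0
        return sum("delivered" in e for e in sent) / len(sent)

    def avg_delivery_latency(self):
        """Average seconds from 'sent' to 'delivered', or None if no data yet."""
        latencies = [
            e["delivered"] - e["sent"]
            for e in self.events.values()
            if "sent" in e and "delivered" in e
        ]
        return sum(latencies) / len(latencies) if latencies else None
```

A webhook handler would parse each incoming notification into this shape and call record; a dashboard job can then poll delivery_rate and avg_delivery_latency at whatever interval it needs.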
