Multi-Channel Notification System
Built a unified notification platform handling 50M+ notifications/day across email, push, SMS, and in-app channels with a 99.95% delivery rate
Overview
Designed and built a centralized notification system to replace fragmented, channel-specific implementations scattered across multiple services
Problem
Notifications were implemented inconsistently across services. Some called third-party APIs directly; others relied on custom implementations. This led to duplicate notifications, inconsistent user preferences, no delivery tracking, and difficulty adding new channels.
Constraints
- Must handle 50M+ notifications per day at peak
- Delivery latency under 30 seconds for time-sensitive notifications
- Must support user preferences and quiet hours
- Budget constraints on third-party provider costs
Approach
Built a unified notification service with a clean API that abstracts channel-specific complexity. Implemented intelligent routing that respects user preferences, handles retries, and optimizes for cost across providers. Used event-driven architecture for scalability.
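As a rough illustration of the unified API, the sketch below shows one possible request shape in Go. The package, type, and field names are assumptions for illustration, not the actual service contract.

```go
// Package notify sketches the public request shape of a unified notification API.
// All names here are illustrative assumptions, not the real contract.
package notify

import "time"

// Channel identifies a delivery channel supported by the platform.
type Channel string

const (
	ChannelEmail Channel = "email"
	ChannelPush  Channel = "push"
	ChannelSMS   Channel = "sms"
	ChannelInApp Channel = "in_app"
)

// Notification is what upstream services submit; the platform resolves
// preferences, routing, deduplication, and provider selection.
type Notification struct {
	ID        string            // idempotency key, also used for deduplication
	UserID    string            // partition key, preserves per-user ordering
	Type      string            // e.g. "order_shipped"; drives templates and preferences
	Urgent    bool              // urgent notifications target sub-30-second delivery
	Channels  []Channel         // empty means "let routing decide"
	Data      map[string]string // template variables
	CreatedAt time.Time
}

// Sender is the single entry point product teams call.
type Sender interface {
	Send(n Notification) error
}
```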
Key Decisions
Use multi-provider strategy with intelligent routing
No single provider is best for all use cases: SendGrid for transactional email, Twilio for SMS, Firebase for push. Intelligent routing optimizes for cost and deliverability based on notification type and recipient; a routing sketch follows this list.
Alternatives considered:
- Single provider for all channels
- Build in-house delivery infrastructure
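A minimal routing sketch under stated assumptions: the fallback providers and the healthy() signal below are illustrative, not the production routing table.

```go
// Illustrative provider routing: pick the first healthy provider per channel
// so failover is automatic rather than waiting on manual intervention.
package main

import "fmt"

// routingTable maps a channel to providers in preference order.
// The fallback entries here are assumptions for illustration.
var routingTable = map[string][]string{
	"email": {"sendgrid", "ses"}, // primary, then fallback
	"sms":   {"twilio", "sns"},
	"push":  {"fcm"},
}

// pickProvider returns the first provider the health signal reports as usable.
func pickProvider(channel string, healthy func(string) bool) (string, error) {
	for _, p := range routingTable[channel] {
		if healthy(p) {
			return p, nil
		}
	}
	return "", fmt.Errorf("no healthy provider for channel %q", channel)
}

func main() {
	// Pretend the primary email provider is degraded.
	healthy := func(p string) bool { return p != "sendgrid" }
	p, err := pickProvider("email", healthy)
	fmt.Println(p, err) // "ses <nil>": traffic fails over to the secondary
}
```

In practice the health signal would be fed by delivery metrics and provider rate-limit state, which is also where cost-based preferences can be applied.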
Implement notification deduplication at the platform level
Duplicate notifications were a major user complaint. Platform-level deduplication with a configurable window prevents duplicates regardless of how many times upstream services trigger the same notification; a sketch follows this list.
Alternatives considered:
- Rely on upstream services to deduplicate
- Client-side deduplication
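A minimal sketch of the deduplication window, assuming Redis (already in the stack) and the go-redis client; the key scheme and window length are illustrative.

```go
// Deduplication via Redis SET NX with a TTL acting as the configurable window.
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// shouldSend returns false if the same (user, type, dedup key) was already
// accepted inside the window, no matter how many upstream services fired it.
func shouldSend(ctx context.Context, rdb *redis.Client, userID, notifType, dedupKey string, window time.Duration) (bool, error) {
	key := fmt.Sprintf("dedup:%s:%s:%s", userID, notifType, dedupKey)
	// SET NX is atomic: only the first caller wins; the TTL expires the window.
	return rdb.SetNX(ctx, key, 1, window).Result()
}

func main() {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	ok, err := shouldSend(context.Background(), rdb, "user-42", "order_shipped", "order-9001", 10*time.Minute)
	fmt.Println(ok, err) // true the first time, false for duplicates within 10 minutes
}
```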
Use Apache Kafka for notification queue
Kafka's durability guarantees and replay capability are essential for notifications. We can't lose notifications, and we need the ability to replay failed batches. The partitioning model also enables parallel processing.
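A producer sketch showing the partition-by-user idea with one possible Go client (segmentio/kafka-go); the topic name, broker address, and payload shape are assumptions.

```go
// Producing to the notification topic keyed by user ID: hashing the key means
// a given user's notifications always land on the same partition, preserving order.
package main

import (
	"context"
	"encoding/json"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	w := &kafka.Writer{
		Addr:     kafka.TCP("localhost:9092"),
		Topic:    "notifications",
		Balancer: &kafka.Hash{}, // hash of the message key selects the partition
	}
	defer w.Close()

	payload, _ := json.Marshal(map[string]string{"type": "order_shipped", "order": "9001"})
	err := w.WriteMessages(context.Background(), kafka.Message{
		Key:   []byte("user-42"), // partition key = user ID
		Value: payload,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```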
Tech Stack
- Go
- Apache Kafka
- PostgreSQL
- Redis
- SendGrid
- Twilio
- Firebase
- Kubernetes
Result & Impact
- Delivery Rate: 99.95% (up from 94%)
- Daily Volume: 50M+ notifications
- Duplicate Complaints: reduced by 95%
- Provider Costs: reduced 30% through smart routing
Product teams can now add notifications to features in minutes instead of days. User preference management has dramatically reduced notification fatigue complaints. The delivery analytics have enabled data-driven optimization of notification strategies.
Learnings
- Notification fatigue is real—build preference management from day one
- Delivery tracking is essential for debugging and optimization
- Provider failover needs to be automatic—manual intervention is too slow
- Template management becomes complex at scale—invest in good tooling
Handling Scale
The system processes 50M+ notifications daily with significant spikes during marketing campaigns. Kafka partitioning by user ID preserves per-user ordering while enabling horizontal scaling.
We implemented backpressure mechanisms to prevent overwhelming downstream providers during spikes. The system automatically queues lower-priority notifications when approaching rate limits.
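A simplified sketch of that backpressure behavior, assuming per-provider token buckets via golang.org/x/time/rate; the limits and queue wiring below are illustrative only.

```go
// Backpressure sketch: high-priority notifications wait for a token,
// low-priority ones are deferred when the provider's budget is exhausted.
package main

import (
	"context"
	"fmt"

	"golang.org/x/time/rate"
)

type notification struct {
	userID   string
	priority string // "high" or "low"
}

// dispatch enforces the provider rate limit instead of hammering it during spikes.
func dispatch(limiter *rate.Limiter, n notification, deferQueue chan<- notification) {
	if limiter.Allow() {
		fmt.Println("send now:", n.userID)
		return
	}
	if n.priority == "low" {
		deferQueue <- n // re-enqueued and retried once the spike passes
		return
	}
	// High priority: block until a token is available rather than dropping.
	_ = limiter.Wait(context.Background()) // real code would use a context with a deadline
	fmt.Println("send after wait:", n.userID)
}

func main() {
	limiter := rate.NewLimiter(rate.Limit(100), 20) // ~100 req/s, burst of 20 (illustrative)
	deferred := make(chan notification, 1024)
	dispatch(limiter, notification{userID: "user-42", priority: "high"}, deferred)
}
```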
User Preference Complexity
User preferences turned out to be surprisingly complex. Users want different settings per channel, per notification type, and even per time of day. We built a flexible preference model that supports these combinations without becoming unwieldy.
The quiet hours feature was particularly appreciated—no more 3 AM push notifications for non-urgent updates.
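A sketch of how such a per-type, per-channel preference model with quiet hours could look; the field names, type keys, and urgency-override rule are assumptions for illustration.

```go
// Preference lookup with quiet hours: non-urgent sends during quiet hours are
// held for later rather than delivered (or dropped).
package main

import (
	"fmt"
	"time"
)

type ChannelPrefs struct {
	Enabled         bool
	QuietStart      int  // hour of day in the user's timezone, e.g. 22
	QuietEnd        int  // e.g. 8
	AllowWhenUrgent bool // urgent notifications may override quiet hours
}

type Preferences struct {
	Timezone string
	// keyed by notification type ("marketing", "order_shipped", ...) then channel
	ByType map[string]map[string]ChannelPrefs
}

// allowed decides whether this notification may be delivered right now.
func allowed(p Preferences, notifType, channel string, urgent bool, now time.Time) bool {
	cp, ok := p.ByType[notifType][channel]
	if !ok || !cp.Enabled {
		return false
	}
	loc, err := time.LoadLocation(p.Timezone)
	if err != nil {
		loc = time.UTC
	}
	h := now.In(loc).Hour()
	var inQuiet bool
	if cp.QuietStart > cp.QuietEnd { // window wraps past midnight, e.g. 22:00-08:00
		inQuiet = h >= cp.QuietStart || h < cp.QuietEnd
	} else {
		inQuiet = h >= cp.QuietStart && h < cp.QuietEnd
	}
	if inQuiet && !(urgent && cp.AllowWhenUrgent) {
		return false
	}
	return true
}

func main() {
	p := Preferences{
		Timezone: "America/New_York",
		ByType: map[string]map[string]ChannelPrefs{
			"marketing": {"push": {Enabled: true, QuietStart: 22, QuietEnd: 8}},
		},
	}
	// false during quiet hours (22:00-08:00 local), true otherwise
	fmt.Println(allowed(p, "marketing", "push", false, time.Now()))
}
```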