
Email failover: Building redundancy

Tags: reliability, failover, architecture

Email is critical infrastructure. When your primary provider goes down, you need failover. Here's how to build redundant email systems.

Why failover matters

  • Provider outages happen
  • API rate limits can block sending
  • DNS issues affect deliverability
  • Regional failures impact availability

Multi-provider architecture

Provider abstraction

interface EmailProvider {
  name: string;
  send(email: Email): Promise<SendResult>;
  checkHealth(): Promise<boolean>;
  getRateLimit(): Promise<RateLimitStatus>;
}

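class EmailrProvider implements EmailProvider {
  name = 'emailr';

  // EmailrClient is a placeholder for whatever client type your SDK exports
  constructor(private client: EmailrClient) {}

  async send(email: Email): Promise<SendResult> {
    return this.client.emails.send(email);
  }

  async checkHealth(): Promise<boolean> {
    try {
      await this.client.health.check();
      return true;
    } catch {
      return false;
    }
  }

  async getRateLimit(): Promise<RateLimitStatus> {
    // Required by the interface; map your provider's rate-limit data here
    throw new Error('getRateLimit not implemented');
  }
}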


class SendGridProvider implements EmailProvider {
  name = 'sendgrid';
  // ... implementation
}
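
The interface above refers to `Email`, `SendResult`, and `RateLimitStatus` without defining them. One possible shape is sketched below. The field names are assumptions for illustration, not taken from any provider SDK. The `InMemoryProvider` class is also hypothetical: it records messages instead of delivering them, which can be handy for tests and failover drills.

```typescript
// Hypothetical type shapes; real provider SDKs will differ.
interface Email {
  to: string;
  from: string;
  subject: string;
  html: string;
}

interface SendResult {
  id: string;
  provider: string;
}

interface RateLimitStatus {
  remaining: number; // sends left in the current window
  resetAt: Date;     // when the window resets
}

interface EmailProvider {
  name: string;
  send(email: Email): Promise<SendResult>;
  checkHealth(): Promise<boolean>;
  getRateLimit(): Promise<RateLimitStatus>;
}

// In-memory provider: records sends instead of delivering them.
class InMemoryProvider implements EmailProvider {
  sent: Email[] = [];
  healthy = true; // flip to false to simulate an outage

  constructor(public name: string) {}

  async send(email: Email): Promise<SendResult> {
    if (!this.healthy) throw new Error(`${this.name} is down`);
    this.sent.push(email);
    return { id: `${this.name}-${this.sent.length}`, provider: this.name };
  }

  async checkHealth(): Promise<boolean> {
    return this.healthy;
  }

  async getRateLimit(): Promise<RateLimitStatus> {
    return { remaining: 1000, resetAt: new Date(Date.now() + 60000) };
  }
}
```

Setting `healthy = false` on the primary makes both `checkHealth()` and `send()` fail, which is enough to exercise a failover path without touching a real provider.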

Failover logic

class EmailService {
  private providers: EmailProvider[];
  private primaryIndex = 0;
  
  constructor(providers: EmailProvider[]) {
    this.providers = providers;
  }
  
  async send(email: Email): Promise<SendResult> {
    const provider = this.providers[this.primaryIndex];
    
    try {
      if (!await provider.checkHealth()) {
        return this.failover(email);
      }
      
      return await provider.send(email);
    } catch (error) {
      return this.failover(email, error);
    }
  }
  
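  private async failover(email: Email, error?: unknown): Promise<SendResult> {
    const primary = this.providers[this.primaryIndex];

    for (let i = 0; i < this.providers.length; i++) {
      if (i === this.primaryIndex) continue;

      const provider = this.providers[i];
      let result: SendResult;

      try {
        if (!(await provider.checkHealth())) continue;
        result = await provider.send(email);
      } catch {
        // This backup failed too; try the next one
        continue;
      }

      // Log outside the send's try block: if logging throws, the loop must
      // not move on and send the same email again through another provider
      try {
        await logFailover(primary.name, provider.name, error);
      } catch {
        // A logging failure should not fail a send that already succeeded
      }

      return result;
    }

    throw new Error('All email providers failed');
  }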
}
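
The service above fails over on any thrown error. Errors caused by the message itself (an invalid recipient, a malformed payload) will usually fail on the backup as well, so one option is to classify the error before failing over. A minimal sketch follows, assuming errors carry a hypothetical numeric `statusCode` property. Real SDKs expose this differently, if at all.

```typescript
// Hypothetical error shape; adapt to what your provider SDK throws.
interface ProviderError extends Error {
  statusCode?: number;
}

// Treat rate limiting (429), server errors (5xx), and errors without
// a status code (network failures, timeouts) as worth a failover.
// Other 4xx responses point at the request itself, so sending it
// through another provider is unlikely to help.
function shouldFailover(error: unknown): boolean {
  if (!(error instanceof Error)) return true;
  const statusCode = (error as ProviderError).statusCode;
  if (statusCode === undefined) return true;
  return statusCode === 429 || statusCode >= 500;
}
```

In `send`, the catch block could rethrow when `shouldFailover(error)` returns false and call `failover` otherwise.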

Health checking

Proactive health monitoring

class HealthChecker {
  private healthStatus: Map<string, boolean> = new Map();
  
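  startMonitoring(providers: EmailProvider[], interval = 30000) {
    // Return the timer so callers can stop monitoring with clearInterval()
    return setInterval(async () => {
      for (const provider of providers) {
        const wasHealthy = this.isHealthy(provider.name);
        // A health check that throws counts as unhealthy
        const healthy = await provider.checkHealth().catch(() => false);
        this.healthStatus.set(provider.name, healthy);

        // Alert on the healthy-to-unhealthy transition only, not on every tick
        if (!healthy && wasHealthy) {
          try {
            await alertOps(`Provider ${provider.name} unhealthy`);
          } catch {
            // An alerting failure should not stop the remaining checks
          }
        }
      }
    }, interval);
  }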
  
  isHealthy(providerName: string): boolean {
    return this.healthStatus.get(providerName) ?? true;
  }
}
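
A single failed probe can be a transient blip. One common refinement is to treat a provider as unhealthy only after several consecutive failures. The sketch below illustrates the idea. The `FlapDamper` name and the default threshold of 3 are assumptions, not part of the code above.

```typescript
class FlapDamper {
  private consecutiveFailures = new Map<string, number>();

  constructor(private readonly threshold = 3) {}

  // Record one probe result and return the damped health status:
  // true until `threshold` probes in a row have failed.
  record(providerName: string, probeOk: boolean): boolean {
    const failures = probeOk
      ? 0
      : (this.consecutiveFailures.get(providerName) ?? 0) + 1;
    this.consecutiveFailures.set(providerName, failures);
    return failures < this.threshold;
  }
}
```

The monitor could store the value returned by `record()` in `healthStatus` instead of the raw probe result.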

Circuit breaker pattern

class CircuitBreaker {
  private failures = 0;
  private lastFailure?: Date;
  private state: 'closed' | 'open' | 'half-open' = 'closed';
  
  private readonly threshold = 5;
  private readonly timeout = 60000; // 1 minute
  
  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailure!.getTime() > this.timeout) {
        this.state = 'half-open';
      } else {
        throw new Error('Circuit breaker open');
      }
    }
    
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }
  
  private onSuccess() {
    this.failures = 0;
    this.state = 'closed';
  }
  
  private onFailure() {
    this.failures++;
    this.lastFailure = new Date();
    
    if (this.failures >= this.threshold) {
      this.state = 'open';
    }
  }
}
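
To watch the state transitions without waiting a full minute, the breaker can take its clock as a parameter. The variant below is a sketch under that assumption. `TimedBreaker`, `currentState()`, and the constructor arguments are hypothetical names, and it makes one behavior explicit: a failed trial call in the half-open state re-opens the circuit and restarts the timeout.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class TimedBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = 'closed';

  constructor(
    private readonly threshold: number,
    private readonly timeoutMs: number,
    private readonly now: () => number = () => Date.now(),
  ) {}

  currentState(): BreakerState {
    return this.state;
  }

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.timeoutMs) {
        throw new Error('Circuit breaker open'); // fail fast, skip the call
      }
      this.state = 'half-open'; // timeout elapsed: allow a trial call
    }

    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch (error) {
      this.failures++;
      // A failed trial call, or too many failures, opens the circuit.
      if (this.state === 'half-open' || this.failures >= this.threshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

A provider would typically get its own breaker, with sends wrapped as `breaker.execute(() => provider.send(email))`, so that a provider that keeps failing is skipped quickly instead of being retried on every message.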

Load balancing

Weighted distribution

class LoadBalancer {
  // The usage below passes the weighted list to the constructor
  constructor(private providers: Array<{ provider: EmailProvider; weight: number }>) {}
  
  selectProvider(): EmailProvider {
    const totalWeight = this.providers.reduce((sum, p) => sum + p.weight, 0);
    let random = Math.random() * totalWeight;
    
    for (const { provider, weight } of this.providers) {
      random -= weight;
      if (random <= 0) {
        return provider;
      }
    }
    
    return this.providers[0].provider;
  }
}

// Usage: 70% primary, 30% secondary
const balancer = new LoadBalancer([
  { provider: emailr, weight: 70 },
  { provider: sendgrid, weight: 30 }
]);
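
Random selection is awkward to test. One way around that is to pass the random value in as a parameter, which also makes it easy to filter the candidate list (for example, dropping unhealthy providers) before choosing. The helper below is a sketch. `pickWeighted` and `Weighted` are hypothetical names.

```typescript
interface Weighted<T> {
  item: T;
  weight: number;
}

// `roll` must be in [0, 1); pass Math.random() in production.
function pickWeighted<T>(candidates: Weighted<T>[], roll: number): T {
  const total = candidates.reduce((sum, c) => sum + c.weight, 0);
  if (total <= 0) throw new Error('No candidates with positive weight');

  let remaining = roll * total;
  for (const c of candidates) {
    remaining -= c.weight;
    if (remaining < 0) return c.item;
  }
  // Guards against floating-point drift on the last bucket.
  return candidates[candidates.length - 1].item;
}
```

With weights of 70 and 30, any roll below 0.7 selects the first candidate. If the first provider is filtered out beforehand, all traffic goes to the remaining one regardless of its weight.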

DNS-based failover

Multiple MX records

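; MX records route inbound mail. Sending servers try the lowest
; preference value first and fall back to higher values if those hosts fail.

; Primary provider (equal preference: load is shared between the two hosts)
@ MX 10 mx1.emailr.dev.
@ MX 10 mx2.emailr.dev.

; Backup provider (used only when no primary host is reachable)
@ MX 20 mx1.backup-provider.com.
@ MX 20 mx2.backup-provider.com.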
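
To confirm that a published zone matches the intended fallback order, the records can be grouped by preference. The sketch below works on plain data so it runs without network access. In Node.js the input could come from `dns.promises.resolveMx()`, which returns objects with `exchange` and `priority` fields. The `deliveryOrder` function name is hypothetical.

```typescript
// Same shape as the records returned by Node's dns.resolveMx().
interface MxRecord {
  exchange: string;
  priority: number;
}

// Group hosts by preference, lowest value (tried first) at the front.
function deliveryOrder(records: MxRecord[]): string[][] {
  const byPriority = new Map<number, string[]>();
  for (const { exchange, priority } of records) {
    const hosts = byPriority.get(priority) ?? [];
    hosts.push(exchange);
    byPriority.set(priority, hosts);
  }
  return Array.from(byPriority.entries())
    .sort(([a], [b]) => a - b)
    .map(([, hosts]) => hosts);
}
```

For the zone above, the first group would hold the two `emailr.dev` hosts and the second group the two backup hosts.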

Graceful degradation

async function sendWithDegradation(email: Email): Promise<SendResult> {
  try {
    // Try full-featured send
    return await primaryProvider.send(email);
  } catch {
    // Degrade to a basic send: strip tracking markup and disable tracking
    const degradedEmail = {
      ...email,
      html: stripTracking(email.html),
      trackOpens: false,
      trackClicks: false
    };
    
    return await backupProvider.send(degradedEmail);
  }
}
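
`stripTracking` is not defined above. A deliberately simple sketch is shown below, assuming tracking is done with 1x1 pixel images that declare `width` and `height` attributes. Regular expressions are a fragile way to process HTML, and real tracking markup varies by provider (rewritten links, CSS-sized images), so treat this as an illustration only.

```typescript
// Remove <img> tags that declare both width=1 and height=1.
function stripTracking(html: string): string {
  const widthOne = /\bwidth=["']?1["']?(?!\d)/i;
  const heightOne = /\bheight=["']?1["']?(?!\d)/i;

  return html.replace(/<img\b[^>]*>/gi, (tag) =>
    widthOne.test(tag) && heightOne.test(tag) ? '' : tag,
  );
}
```

Images with other dimensions, such as a logo, pass through unchanged.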

Best practices

  1. Multiple providers - At least two for redundancy
  2. Health checks - Proactive monitoring
  3. Circuit breakers - Prevent cascade failures
  4. Graceful degradation - Partial functionality beats none
  5. Alert on failover - Know when it happens
  6. Test failover - Regular drills

Email failover is insurance. You hope you never need it, but when you do, it's invaluable.


Written by the emailr team
