
Disaster recovery runbook

What we'll do if something serious goes wrong, and what you can do today to be ready.

Tiers of failure (and our response)

1. Process crash: auto-restart within seconds.
2. Container failure: auto-restart within seconds, no customer impact.
3. Host failure: manual restore to a standby VM from the nightly backup. RTO ~2h.
4. Data centre failure: switch to the standby DC. RTO ~4h.
5. Catastrophic loss: restore from off-region encrypted backups. RPO ≤ 24h.
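To make the RPO figure concrete: with nightly backups, the data at risk in a catastrophic loss is whatever was written since the last successful backup. The following is a minimal Python sketch of that check, assuming you record the timestamp of the last successful backup yourself. The name `rpo_met` is hypothetical and not part of any Mojaedge API; the 24-hour default simply mirrors the target stated for catastrophic loss.

```python
from datetime import datetime, timedelta, timezone


# Hypothetical helper, not a Mojaedge API. The default mirrors the
# "RPO <= 24h" target stated for catastrophic loss.
def rpo_met(last_backup: datetime, now: datetime,
            rpo: timedelta = timedelta(hours=24)) -> bool:
    """Return True if a failure at `now` would lose at most `rpo` of data."""
    return now - last_backup <= rpo


now = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)
# Backup taken exactly 24 hours ago: still within the target.
print(rpo_met(datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc), now))    # True
# Backup taken 48 hours ago (one nightly run was missed): target exceeded.
print(rpo_met(datetime(2023, 12, 31, 3, 0, tzinfo=timezone.utc), now))  # False
```

A single missed nightly run is enough to push the exposure past 24 hours, which is why a backup failure is worth alerting on in its own right.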

What we do for you

Each tenant's site is backed up nightly into sites/<tenant>/private/backups/ (files, database, and settings). Encrypted snapshots are also kept off-host. A nightly backup-drill picks one tenant, restores its backup into a sandbox, and runs a sanity probe against the result. Drill failures are reported in HQ and on Telegram.
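The drill described above follows a simple pattern: pick a tenant, restore, probe, alert on failure. The sketch below illustrates that pattern in Python and is not the actual implementation. The restore, probe, and alert steps are passed in as callables because their real interfaces are not documented here, and choosing the tenant at random is an assumption; the page only says that one tenant is picked.

```python
import random
from typing import Callable, Optional, Sequence


def run_backup_drill(
    tenants: Sequence[str],
    restore_to_sandbox: Callable[[str], str],   # hypothetical: returns a sandbox handle
    sanity_probe: Callable[[str], bool],        # hypothetical: True if the restore looks healthy
    alert: Callable[[str], None],               # hypothetical: e.g. post to HQ and Telegram
    rng: Optional[random.Random] = None,
) -> bool:
    """Restore one tenant's backup into a sandbox and probe it.

    Returns True if the drill passed. Any failure is sent to `alert`.
    """
    rng = rng or random.Random()
    tenant = rng.choice(list(tenants))  # assumption: random selection
    try:
        sandbox = restore_to_sandbox(tenant)
        ok = sanity_probe(sandbox)
    except Exception as exc:
        # A restore or probe that raises counts as a failed drill.
        alert(f"backup drill failed for {tenant}: {exc}")
        return False
    if not ok:
        alert(f"backup drill failed for {tenant}: sanity probe did not pass")
    return ok
```

Treating an exception the same as a failed probe means a backup that cannot be restored at all is reported rather than silently skipped.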

What you can do today

1. Enable warehouse so ClickHouse-side history covers your restore window.
2. Designate a recovery contact (phone + email) in your tenant.
3. Keep at least two ops users in HQ with TOTP enrolled.
4. Keep Knowledge Documents current: they drive customer self-service during a recovery window.
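If you want to audit the first three items periodically, they reduce to a few boolean checks. The Python sketch below assumes you have already collected the relevant settings by hand or from your own tooling. The `TenantReadiness` and `OpsUser` types and their field names are hypothetical and do not correspond to a Mojaedge API. Item 4 (keeping Knowledge Documents current) needs human review and is not covered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OpsUser:
    name: str
    totp_enrolled: bool


@dataclass
class TenantReadiness:
    # All fields are hypothetical stand-ins for settings you look up yourself.
    warehouse_enabled: bool
    recovery_phone: str
    recovery_email: str
    ops_users: List[OpsUser] = field(default_factory=list)


def readiness_gaps(t: TenantReadiness) -> List[str]:
    """Return a list of unmet checklist items (empty means ready)."""
    gaps = []
    if not t.warehouse_enabled:
        gaps.append("warehouse is not enabled")
    if not (t.recovery_phone and t.recovery_email):
        gaps.append("recovery contact needs both phone and email")
    enrolled = sum(1 for u in t.ops_users if u.totp_enrolled)
    if enrolled < 2:
        gaps.append(f"only {enrolled} ops user(s) with TOTP enrolled; need at least 2")
    return gaps
```

An empty result means items 1 to 3 are satisfied; anything else is a to-do list for before an incident.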

Communications during an incident

We post updates on status.mojaedge.com every 30 minutes until the incident is resolved. Your designated recovery contact receives SMS and email. Telegram and Slack notifications are sent if you are subscribed.

