
IT-Checklists.com - The eBook-Shop with Checklists and Templates for Professionals

Template for IT Outage Planning and Notification

Summary and Scope

Outage Planning is a sub-task of Maintenance Planning.
This template explains in depth the tasks required to plan outages of medium to large IT systems and helps ensure that no important step is forgotten.
The initial, one-off step when using this generic template is to customize it for your organization, i.e. to fill in the enterprise-wide generic information such as the contact persons in the departments involved in planning and approving outages. The templates and checklists may also need to be reformatted according to company standards.
Afterwards the customized template is available for planning individual outages.
If your company has long had a "Central Outage Coordinator / Manager" position, probably supported by a dedicated team, then you don't need this document. In that case, however, you already know how complex it is to plan and coordinate outages.
Secondary use of this document:
When a new, very important IT system is still in the planning phase, customize this generic document for the new system. This exercise validates the related information that needs to be part of the operations manual, or helps to complete that system-specific information in the operations manual.
Of course we can't guarantee that this template is 100% complete, but it definitely helps to avoid the most common pitfalls.

Overview

The new software release is fully tested and the new hardware parts have been delivered. But before the application can finally be shut down to install a new software version or to add or replace hardware, the following steps, explained in detail on 23 pages, need to be executed:
1) Requesting the Outage
Initiates the outage planning, typically with a form that contains initial information about the system to be shut down.
2) Impact Analysis – Evaluation
Checks which other directly or indirectly interfacing systems and which groups of users are affected. Some systems might be strongly affected because they access the application for which the outage is planned during normal processing. Other systems might be less affected, but they might receive their nightly data update later, or one of their jobs is part of a job chain that will start late due to the outage.
3) Impact Analysis – Summary
This section summarizes all details identified in the evaluation step and helps to better understand the end-to-end impact of shutting down one single application.
4) Detailed planning – Preparing the outage
This includes
  • staff planning for the time during the outage and afterwards, until all jobs are back on their normal schedule,
  • requesting an additional backup before changes are implemented,
  • informing the monitoring team which systems need to be excluded from monitoring – or where alarms will be raised,
  • communicating the outage to external partners and customers (if applicable),
  • communicating the outage to all affected internal staff.
5) Detailed planning – Action Plan
Detailed list of actions, including a time plan and a notification plan.
6) Approval / Sign-Off
After all affected business departments and all participating IS/IT departments have reviewed and accepted the outage plan, the Change Manager and/or Operations Manager reviews it and either approves it or requests corrections or additions.
Note: In many companies these approvals are issued in weekly change request / approval meetings, where the request and the planned actions are reviewed and the requester (and their expert) might need to answer questions from the Change Manager or Operations Manager. Missing such a meeting might postpone the outage by another week.
Clearly, the process of requesting and planning the outage needs to be initiated sufficiently far in advance.
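
The impact analysis in steps 2 and 3 essentially follows interfaces outward from the system that will be shut down. The following is only a minimal Python sketch of that idea, not part of the template itself: the system names and the interface map `CONSUMERS` are hypothetical, and a real inventory would come from your own interface documentation.

```python
from collections import deque

# Hypothetical interface map (assumption for illustration only):
# each key is a system, each value lists the systems that directly
# consume its data or services.
CONSUMERS = {
    "billing": ["crm", "reporting"],
    "crm": ["helpdesk"],
    "reporting": ["data-warehouse"],
    "helpdesk": [],
    "data-warehouse": [],
}


def affected_systems(consumers, outage_system):
    """Return {system: hops} for every system reachable from outage_system.

    hops == 1 means a direct interface, hops > 1 an indirect one.
    """
    hops = {}
    queue = deque([(outage_system, 0)])
    while queue:
        system, distance = queue.popleft()
        for consumer in consumers.get(system, []):
            # Skip the outage system itself and anything already recorded.
            if consumer != outage_system and consumer not in hops:
                hops[consumer] = distance + 1
                queue.append((consumer, distance + 1))
    return hops


impact = affected_systems(CONSUMERS, "billing")
direct = sorted(s for s, h in impact.items() if h == 1)
indirect = sorted(s for s, h in impact.items() if h > 1)
print("directly affected:", direct)
print("indirectly affected:", indirect)
```

With this sample map, an outage of `billing` lists `crm` and `reporting` as directly affected and `data-warehouse` and `helpdesk` as indirectly affected. A breadth-first walk is used so that each system is recorded with its shortest distance from the outage, which roughly matches the distinction between strongly and less affected systems.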

Frequently Asked Questions (FAQ)

Question: We invested a lot of money in a "High Availability (HA)" deployment of our important application. Why do we still need a planned outage?

Answer: A "High Availability (HA)" solution strives to eliminate unplanned outages caused by hardware failures, but usually does not support upgrades without downtime. "Rolling upgrades" that need no planned outage require a deployment at the "Continuous Availability" level, which is higher than just "High Availability".