Today, businesses and customers have one clear expectation: that their apps and devices will connect and work seamlessly all the time. This expectation assumes that the service provider has mechanisms in place to ensure that digital services never fail.
Unfortunately, that isn’t the reality. Technology systems are sometimes prone to failure. They’re certainly subject to interruptions outside their control.
Disruptions of all sorts can severely cripple an organization’s ability to deliver digital services: natural disasters, cyberattacks, power outages, fires, floods, large-scale IT infrastructure failures, or events that make key personnel or suppliers unavailable. You need a continuity plan ready for these disruptions.
If you need to recover quickly or limit the extent of disruption to your services, planning for continuity is essential. Failure to do so may lead to serious consequences, up to and including the loss of the entire enterprise. According to FEMA, about 25% of businesses do not reopen after disasters.
One of the key activities within the continuity planning process is conducting a business impact analysis (BIA). The ISO 22301 standard on business continuity management systems defines a BIA as the process of analyzing the impact over time of a disruption on the organization.
In this article, we will delve into the objectives of the BIA, steps involved, process outputs, and some recommendations for getting the most out of this critical process.
Per NIST SP 800-34 guidance, the purpose of Business Impact Analysis is to correlate an information system with the critical mission/business processes and services provided and, based on that information, characterize the consequences of a disruption.
The BIA is an integral part of the business continuity management system. Before identifying which business continuity strategies and solutions to implement, organizations should complete a cycle of the BIA process.
The BIA is distinctly different from a risk assessment. However, they do share synergies:
The risk mitigation actions can be informed and prioritized by the outcome of the BIA exercise.
Additionally, the BIA can receive input from the risk assessment process that will inform the extent of disruption should a risk materialize.
The BIA process uses impact types and criteria to assess the impacts over time that result from disrupting the business activities that deliver products and services. The magnitude and duration of the resulting impact are then used to identify prioritized activities. Each prioritized activity is assigned a predetermined timeline for restoring availability, performance, and security to levels that the business, users, and other stakeholders have agreed are acceptable.
The timelines are then used as inputs to inform which resources are required to support these prioritized activities.
Generally, a BIA exercise produces three time-based indicators: the recovery time objective (RTO), the recovery point objective (RPO), and the maximum acceptable outage (MAO). They are defined as follows:
RTO is the period of time following an incident within which a product and service or an activity is resumed, or resources are recovered. For example:
A core banking system may have a target RTO of 5 minutes.
A single office printer may have an RTO of several days.
RPO is the point to which information used by an activity is restored to enable the activity to operate on resumption. RPO can also be referred to as “maximum data loss”.
The RPO is usually computed from the time the last backup was taken. For example:
A cloud-based CRM may have an RPO of less than 1 minute if there is near-real time replication of data.
A data center server whose backup is carried out daily at midnight has an RPO of 24 hours.
MAO is the time it would take for adverse impacts, which can arise as a result of not providing a product/service or performing an activity, to become unacceptable. It can also be referred to as the “maximum tolerable period of disruption” (MTPD).
The MAO is usually longer than the RTO, since it also covers the time it takes to detect the incident and then recover from the impacts encountered.
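As a rough illustration of how the three indicators relate, the sketch below records targets for one system and checks two things: whether the data lost when restoring from the last backup stays within the RPO, and whether the RTO fits inside the MAO. The class and function names, and the 4-hour RTO and 12-hour MAO, are hypothetical values chosen for the example. Only the daily-midnight backup scenario comes from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTargets:
    """Time-based BIA indicators for one system (illustrative structure)."""
    name: str
    rto: timedelta  # target time to resume the service
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    mao: timedelta  # time after which impacts become unacceptable

    def is_consistent(self) -> bool:
        # Recovery should be planned to finish before impacts become unacceptable.
        return self.rto <= self.mao


def data_loss_at(incident: datetime, last_backup: datetime) -> timedelta:
    """Data lost if the system is restored from the most recent backup."""
    return incident - last_backup


# Daily backup at midnight; the incident strikes just before the next backup.
last_backup = datetime(2024, 1, 1, 0, 0)
incident = datetime(2024, 1, 1, 23, 59)
loss = data_loss_at(incident, last_backup)

# RTO and MAO values here are assumptions for illustration only.
server = RecoveryTargets(
    name="data center server",
    rto=timedelta(hours=4),
    rpo=timedelta(hours=24),
    mao=timedelta(hours=12),
)

print(loss <= server.rpo)      # True: worst-case loss stays within 24 hours
print(server.is_consistent())  # True: 4-hour RTO fits inside 12-hour MAO
```

A target set whose RTO exceeds its MAO would fail the consistency check, which is a signal to revisit either the recovery strategy or the stated tolerance.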
The ISO/TS 22317:2021 guidelines for business impact analysis outline the following eight steps for conducting a BIA:
During planning, the BIA lead will allocate resources and responsibilities required for the BIA process. This includes:
Grouping services and systems.
Identifying the relevant subject matter experts who can provide the necessary information related to impacts and required recovery timelines.
Planning requires the support of the organization’s leadership and should also result in formal communication of the schedule for the rest of the process activities. Planning will also prepare the templates to be used for documenting the BIA outputs.
If the organization has never conducted a BIA, then this step involves specifying how the impact assessment will take place. The approach considers four key dimensions:
Understanding impacts. Impacts refer to the result of the disruption on the organization. Depending on the operating context, the BIA participants can group impacts into types based on their source and area of coverage to facilitate easier assessment of the effect on products, services and business processes. Examples of types include business objectives, financial, legal, regulatory or contractual, or reputational.
Determining impact criteria. Here the degree of impact is agreed and mapped to a scale that can be applied for analysis purposes. For example, a financial impact criterion can have levels 1-5, with 1 mapped to below $1,000, while 5 can be above $10,000,000.
Determining time frames. To analyze impacts over a period of time, the BIA participants will also choose scales for time frames. For example, there can be five levels, ranging from level 1 (below 10 minutes) to level 5 (beyond 2 weeks).
Defining the methodology. The methodology brings the previous three dimensions into a coherent procedure that ensures that different BIA participants can come to a similar decision based on their understanding of impacts and time frames.
The methodology will also define techniques to be used for the rest of the steps in the BIA process.
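The impact criteria and time-frame scales described above can be sketched as simple lookup functions. This is a minimal sketch, not a prescribed method: only the outer bounds (below $1,000 and above $10,000,000; below 10 minutes and beyond 2 weeks) come from the examples in the text. The intermediate thresholds, the boundary handling, the function names, and the "level 4 is unacceptable" cutoff are all assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import timedelta
from typing import Dict, Optional

# Thresholds separating levels 1-5. The first and last values follow the
# article's examples; the middle values are assumed for this sketch.
FINANCIAL_BOUNDS_USD = [1_000, 100_000, 1_000_000, 10_000_000]
TIME_FRAME_BOUNDS = [
    timedelta(minutes=10),
    timedelta(hours=4),   # assumed
    timedelta(days=1),    # assumed
    timedelta(weeks=2),
]


def financial_level(loss_usd: float) -> int:
    """Map an estimated loss to a 1-5 level (a value on a bound rounds up)."""
    return bisect_right(FINANCIAL_BOUNDS_USD, loss_usd) + 1


def time_frame_level(outage: timedelta) -> int:
    """Map an outage duration to a 1-5 time-frame level."""
    return bisect_right(TIME_FRAME_BOUNDS, outage) + 1


def first_unacceptable_frame(
    losses_by_frame: Dict[int, float], threshold: int = 4
) -> Optional[int]:
    """Return the earliest time frame whose impact level reaches the threshold."""
    for frame in sorted(losses_by_frame):
        if financial_level(losses_by_frame[frame]) >= threshold:
            return frame
    return None


# Hypothetical loss estimates (USD) for each time-frame level of one service.
estimated_losses = {1: 200, 2: 5_000, 3: 250_000, 4: 2_000_000, 5: 15_000_000}

print(financial_level(500))                              # 1
print(time_frame_level(timedelta(weeks=3)))              # 5
print(first_unacceptable_frame(estimated_losses))        # 4
```

Writing the scales down in one shared place, whatever the format, is what lets different participants reach similar ratings for the same scenario.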
Decisions on priorities are made by the organization’s leadership in consultation with subject matter experts. Information on product and service priorities can come from various sources, such as:
Customers
Regulators
Suppliers
Other stakeholders.
In addition, risk assessment activities and lessons learnt from previous disruptions are key inputs in determining priorities. The leadership is best suited to make decisions on priorities because their position gives them a wider view of the entire operational landscape and knowledge of the organization’s future strategic direction.
In this phase, you’ll assign priority scales to products and services. For example, an AI platform or e-commerce website that generates the highest revenue is given the highest priority.
Once you’ve prioritized the products and services, the underlying business processes that create, deliver, and maintain them are identified, analyzed, and prioritized based on similar scales. Information on activity priorities is usually provided by the process practitioners, who understand what is directly involved in making the products and services work effectively.
At this point, RTO and MAO indicators are assigned to these activities, based on the priority scales. Activities are analyzed based on:
Demand fluctuations
Stakeholder requirements
Dependencies
Once the process activities are determined and prioritized, the BIA participants will then identify resources that are required to carry out the activities effectively. These include people, equipment, suppliers, information, facilities, logistics and others.
Resources and their dependencies are analyzed on the basis of the capacity levels required to meet both the RTO and the MAO, since some activities can be easily resumed with minimal levels of resources.
It is also during the analysis of information and data requirements that the RPO is determined.
This stage also identifies single points of failure: resources that have no alternative and whose absence or failure significantly impairs a process. For example, the organization might depend on a single supplier to process credit card payments made on its customer app.
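One simple way to surface single points of failure is to record, for each resource, which alternatives exist and flag those with none. The sketch below assumes a hypothetical resource inventory. The `Resource` structure and the example entries are illustrative, with the payment supplier taken from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """A resource needed by an activity, with any known alternatives."""
    name: str
    activity: str
    alternatives: List[str] = field(default_factory=list)


def single_points_of_failure(resources: List[Resource]) -> List[str]:
    """Return the names of resources that have no alternative."""
    return [r.name for r in resources if not r.alternatives]


# Hypothetical inventory gathered during the resource analysis step.
resources = [
    Resource("card payment processor", activity="customer app payments"),
    Resource(
        "primary data center",
        activity="customer app hosting",
        alternatives=["secondary data center"],
    ),
]

print(single_points_of_failure(resources))  # ['card payment processor']
```

An empty list of alternatives is only a first filter. Whether a listed alternative can actually carry the load within the RTO still needs to be assessed.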
Here, the information gathered on the products and services, activities, and resources is brought together and analyzed, and results are generated. Appropriate quantitative and qualitative analysis techniques are applied based on the chosen methodology to collate the results, and a formal review is conducted to ensure that the results are correct, credible, consistent, current, and complete.
A key element to be considered during the analysis activity is dependencies, as setting the RTO for dependent activities requires foresight to prevent subordinate activities from having a higher RTO than the processes they are meant to support.
The summary of the BIA is then shared with the organization’s leadership for review and endorsement. The leadership’s commitment is required to ensure that they will be held accountable to:
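The dependency rule above lends itself to a mechanical check during consolidation. The following sketch flags any supporting activity whose RTO is longer than that of the activity it supports. The activity names, RTO values, and the `rto_conflicts` helper are hypothetical, and the check is a simplification that compares targets only, without modeling restart time after a dependency comes back.

```python
from datetime import timedelta
from typing import Dict, List, Tuple


def rto_conflicts(
    rto: Dict[str, timedelta], depends_on: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Return (activity, dependency) pairs where the dependency's RTO is longer."""
    conflicts = []
    for activity, dependencies in depends_on.items():
        for dependency in dependencies:
            if rto[dependency] > rto[activity]:
                conflicts.append((activity, dependency))
    return conflicts


# Hypothetical targets collected from the BIA participants.
rto = {
    "online checkout": timedelta(hours=1),
    "payment gateway": timedelta(minutes=30),
    "customer database": timedelta(hours=4),
}
depends_on = {"online checkout": ["payment gateway", "customer database"]}

print(rto_conflicts(rto, depends_on))
# [('online checkout', 'customer database')]
```

Here the checkout target of 1 hour cannot be met while the database it relies on may take up to 4 hours to return, so one of the two targets has to change.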
Provide the requisite resources to meet the recovery targets.
Make informed decisions pertinent to recovery.
Because we live in a VUCA (volatility, uncertainty, complexity, and ambiguity) world where change is the only constant, the BIA process should never be treated as a one-time activity. Instead, organizations should ensure that BIA activities are regularly carried out, at least every year.
Information from strategic changes, technology evolution, and legal and contractual requirements should be provided to update the BIA.
In addition, the results from continuity tests and actual invocations, as well as results from risk assessment and treatment should also be considered as input in order to keep the BIA fit for use.
When it comes to determining recovery timeframes, organizations should be wary of generalizing RTO and RPO across all IT systems. While stakeholders might argue that all systems are important and should be treated equally, this approach may result in inefficiency in cost and resource allocation.
Instead, it is prudent to set these targets on a system-by-system basis that considers both cost and complexity requirements. Participants in the BIA process should take advantage of resources from third-party vendors to determine RPO and RTO. (The AWS Resilience Hub is one example.)
The quality of the BIA process and its outcomes is key to selecting appropriate business continuity strategies and solutions.
But quality should be balanced against the time taken to conduct the process. The BIA is an estimation activity, and if the organization takes too long trying to perfect it, less time remains for developing and implementing the business recovery strategies.