Operational Resilience: The Definitive Guide & Step-by-Step Framework

One of the key concepts getting serious airplay on the current risk management stage is operational resilience. It is a key focus of global financial services regulators, led by the UK and the Bank of England, and an increasing focus of governments worldwide with respect to critical infrastructure. What is operational resilience and how should you go about ensuring your operational resilience capability is up to scratch?

What is operational resilience?

Resilience derives from the Latin word “resilire” which means to recoil or rebound. The Oxford English Dictionary defines resilience as “the capacity to recover quickly from difficulties; toughness”. While the OED definition focuses on the ability to recover quickly, resilience is also the ability to withstand adversity in the first place.

Operational resilience for organisations therefore means:

To withstand adversity. This covers two elements:
1. Positioning the organisation so that it is not affected by disruptive events, e.g. locating outside an earthquake zone
2. Being able to withstand a disruptive event so that damage is minimised, e.g. using an earthquake-proof building if you have to be in an earthquake zone
To be able to recover quickly and successfully from adversity
To be able to pivot the organisation quickly and permanently if the adversity or disruptive events create a new normal
To be able to learn from the disruptive events and make the organisation stronger

Operational resilience is not new. There are many aspects of risk management that focus on resilience, particularly Business Continuity and Disaster Recovery. These disciplines play a major role in operational resilience, but on their own they do not cover all of it. Operational resilience comprises a range of activities, processes and capabilities.

What should an operational resilience process look like?

So, moving from the why and what to the how, what does a strong operational resilience capability look like and where do you start?

1. Stakeholders and objectives

The starting point is to define which of your organisation’s ultimate outputs need to be resilient. You do this by identifying your key stakeholders and the value and service you bring to them. This provides the ultimate objectives of operational resilience.

Examples may be:

To provide payment services to customers to allow customers to buy, sell, borrow and invest
To provide stability of the banking system
To provide utility services such as electricity, gas or water
To provide telecoms services

2. Important Business Services

The second step is to identify the Important Business Services (IBS) that are required to deliver your key services to your stakeholders. Each IBS will be an end-to-end process. We need to map these processes so that we understand the sub-processes and critical resources needed for their successful operation. For example, for a payments provider, the IBS will be the complete end-to-end payments process.

3. Impact tolerances

You need to set impact tolerances for the negative impacts that a disruption may have on your key stakeholders. To do this, we need to determine how the objectives noted in the first step can be measured. For payment services, the ultimate impact on the customer may be such things as financial hardship or reduced quality of life. A measurable proxy for this may be the level of availability of the service.

Once the objectives measurement has been determined, maximum impact tolerances need to be set at a point where if they are exceeded, an unacceptable level of damage occurs to the stakeholder. For the payments service, this may be the maximum acceptable outage.
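As a minimal sketch of how such a tolerance could be recorded and checked, the Python below compares an observed outage against a maximum acceptable outage. The class name, the field names and the 120-minute threshold are illustrative assumptions, not values taken from any regulation or from this framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImpactTolerance:
    service: str
    max_outage_minutes: int  # assumed threshold, for illustration only

    def is_breached(self, outage_minutes: int) -> bool:
        """True when the outage exceeds the maximum acceptable outage."""
        return outage_minutes > self.max_outage_minutes


# Hypothetical tolerance for a payments service
payments_tolerance = ImpactTolerance(service="Payments", max_outage_minutes=120)

print(payments_tolerance.is_breached(90))   # False: within tolerance
print(payments_tolerance.is_breached(180))  # True: tolerance exceeded
```

In practice an organisation may set several tolerances per service (for example, on outage duration and on the number of customers affected); a single duration is used here to keep the idea visible.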

4. Sub-processes

The next step is to identify the various sub-processes that make up the IBS. For the payments service, a sub-process could be Merchant Switching.

5. Critical resources

For each sub-process we then need to identify and map the critical resources (e.g. People, Physical Assets, Technology Assets) required to operate that process, and therefore the associated Important Business Service.

6. Resource health

Once the critical resources are known, we need to be able to assess the health of each resource in terms of its ability to withstand stress (prevention) and also the ability to recover from stress (cure).

At this stage we have a complete understanding of the end-to-end process which needs to be operating effectively in order to deliver the identified objectives.
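The mapping built up over steps 4 to 6 can be pictured as a simple hierarchy: an Important Business Service contains sub-processes, and each sub-process depends on critical resources with a health rating. The sketch below is one possible way to hold that map. The class names, the 1 to 5 scoring scale, the threshold of 3 and the example resources are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    name: str
    kind: str       # e.g. "People", "Physical Asset", "Technology Asset"
    withstand: int  # assumed 1-5 score: ability to withstand stress (prevention)
    recover: int    # assumed 1-5 score: ability to recover from stress (cure)


@dataclass
class SubProcess:
    name: str
    resources: List[Resource] = field(default_factory=list)


@dataclass
class ImportantBusinessService:
    name: str
    sub_processes: List[SubProcess] = field(default_factory=list)

    def weak_resources(self, threshold: int = 3) -> List[str]:
        """Return 'sub-process / resource' labels scoring below the threshold."""
        return [
            f"{sp.name} / {r.name}"
            for sp in self.sub_processes
            for r in sp.resources
            if min(r.withstand, r.recover) < threshold
        ]


# Hypothetical payments service with one mapped sub-process
payments = ImportantBusinessService(
    name="Payments",
    sub_processes=[
        SubProcess(
            name="Merchant Switching",
            resources=[
                Resource("Switching platform", "Technology Asset", withstand=4, recover=2),
                Resource("Operations team", "People", withstand=4, recover=4),
            ],
        )
    ],
)

print(payments.weak_resources())  # ['Merchant Switching / Switching platform']
```

Scoring prevention and cure separately means a resource that is robust but slow to recover is still flagged, which mirrors the two halves of the health assessment described in step 6.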

7. Scenarios and simulations

We then need to consider a range of severe but plausible disruptive scenarios. These may include such things as natural disasters, pandemics, social unrest, conflict and terrorism.

We then need to understand what will happen to our IBS if the identified disruptive scenarios were to occur. This requires running simulations, usually by way of desktop simulations, of the selected scenarios and assessing the impact on the IBS, the objectives and ultimately on your stakeholders. The test results then need to be evaluated against the Impact Tolerances.
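Evaluating simulation results against an impact tolerance can be sketched as a simple comparison. In the Python below, the scenario names, the estimated outage figures and the four-hour tolerance are invented for illustration; in a real exercise the estimates would come out of the desktop simulation itself.

```python
from typing import Dict

MAX_OUTAGE_HOURS = 4.0  # assumed impact tolerance for the payments IBS

# Hypothetical outage estimates (in hours) produced by desktop simulations
simulated_outages: Dict[str, float] = {
    "Data centre flood": 6.0,
    "Pandemic staff shortage": 2.0,
    "Regional power outage": 4.0,
}


def evaluate(outages: Dict[str, float], max_outage_hours: float) -> Dict[str, str]:
    """Label each scenario as within or outside the impact tolerance."""
    return {
        scenario: "outside tolerance" if hours > max_outage_hours else "within tolerance"
        for scenario, hours in outages.items()
    }


results = evaluate(simulated_outages, MAX_OUTAGE_HOURS)
for scenario, verdict in results.items():
    print(f"{scenario}: {verdict}")
```

Here an outage exactly equal to the tolerance is treated as within tolerance; whether the boundary counts as a breach is a policy choice each organisation would need to make explicitly.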

8. Learnings and resilience improvements

Where the scenario results are outside of tolerance, we need to analyse them and identify the weaknesses, vulnerabilities, single points of failure and so on. Each issue identified should then lead to actions to resolve it.
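One weakness that lends itself to a mechanical check is the single point of failure. As a rough sketch, assume each sub-process lists what it needs and which interchangeable providers can supply each need; any need with exactly one provider is then a candidate single point of failure. The dependency data and the function name below are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical map: sub-process -> need -> interchangeable providers
dependencies: Dict[str, Dict[str, List[str]]] = {
    "Merchant Switching": {
        "switching platform": ["primary data centre"],
        "operations staff": ["team A", "team B"],
    },
    "Settlement": {
        "settlement engine": ["primary data centre", "secondary data centre"],
    },
}


def single_points_of_failure(
    deps: Dict[str, Dict[str, List[str]]]
) -> List[Tuple[str, str]]:
    """Return (sub-process, need) pairs that have exactly one provider."""
    return sorted(
        (sub_process, need)
        for sub_process, needs in deps.items()
        for need, providers in needs.items()
        if len(providers) == 1
    )


print(single_points_of_failure(dependencies))
# [('Merchant Switching', 'switching platform')]
```

A check like this only finds gaps that are visible in the map, so its value depends on how completely the critical resources were identified in the earlier steps.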

9. Reporting and Accountability

Finally, we need to consider the information that needs to be produced and reported around the above steps. This will include external reporting to regulators and other third parties, and internal reporting to the Board, executive management and other interested parties.

About this series

This article is the first in a series to be published over the coming months which will dive deeper into the elements above. We will add links to this article as new articles in the series are published:

What is operational resilience? [this blog]
What are your Important Business Services?
Designing your impact tolerances
Mapping your Important Business Services
Design and running of a scenario
Identification of weaknesses and actions in your operational resilience
What reporting do management want to see?
Designing a good self-assessment process

Protecht's Complete Guide to Achieving Operational Resilience eBook gives you a detailed look at operational resilience. It explains exactly what makes it different from Disaster Recovery and Business Continuity, and provides a list of steps to help you develop your own operational resilience capability. Find out more and download it now.
