A consistent challenge we face when supporting clients is the perceived complexity of Maintenance and Work Management KPIs. This will be familiar to many organisations: those who live and breathe KPIs find them a doddle, easy and straightforward, while those who use them less often find them murder, cumbersome and impenetrable to the point of uselessness.
This means that whatever insight KPIs can provide is lost to many, and recording them becomes largely redundant.
Given this, we thought it would be useful to provide a brief overview of two of the most common industry KPIs and their use, so that anyone within an organisation can better understand their company’s work management performance.
Most Work Management KPIs are designed to measure and/or ensure three things:
- Risk is being managed effectively.
- Work is completed before any negative consequence occurs.
- Work is carried out efficiently.
Maintenance work falls into two basic categories: Preventative Maintenance and Defect Maintenance.
Preventative Maintenance is work which is carried out as part of a pre-determined strategy to avoid a piece of equipment failing.
Defect Maintenance is maintenance to fix something that is broken.
Both maintenance types have an associated date by which the work must be executed, before the likelihood of negative consequences becomes too high to accept.
An everyday example of how the date is determined for Preventative Maintenance is sitting in your driveway. Your car has a service manual which sets out how often the brake fluid should be changed. Brake fluid can absorb moisture over time which can lead to it losing effectiveness and cause corrosion of the system. The service manual recommends that the fluid should be changed after 4 years. Therefore, the Preventative Maintenance date for renewing the brake fluid should fall on a date 4 years from the last change out. Of course, the negative consequence this is preventing is the brakes on the car failing when you step on the pedal.
As an example of how the date is determined for Defect Maintenance, you notice the tread on your car's tyres is getting low so you will need to change them out soon. You estimate that you have 3 weeks to do so before reaching the legal tread depth limit. Therefore, the Defect Maintenance date in this instance is 3 weeks from the date you noticed the tread getting low. The negative consequence this is preventing is the tyres on the car not having enough tread to control the car.
As these examples illustrate, the dates defined using these two approaches are set to ensure the necessary work is carried out before any negative consequence happens. For the purposes of this article, we’ll call this date the Risk Date (RD).
In the above examples, the necessary work on the car should be scheduled to be completed ahead of the Risk Date. Let’s call the date on which the work is scheduled to finish the Scheduled Finish Date (SFD).
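The two car examples can be expressed as simple date arithmetic. The following is a minimal sketch only: the calendar dates are hypothetical, the helper names `add_years` and `scheduled_ahead_of_risk` are our own, and we assume that a Scheduled Finish Date falling on the Risk Date itself still counts as being "ahead" of it.

```python
from datetime import date, timedelta


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only raised for 29 Feb when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def scheduled_ahead_of_risk(sfd: date, rd: date) -> bool:
    """True when the Scheduled Finish Date is on or before the Risk Date (assumption)."""
    return sfd <= rd


# Preventative Maintenance: brake fluid last changed on a hypothetical date,
# with the 4-year interval taken from the service manual example.
last_change = date(2021, 3, 15)
pm_risk_date = add_years(last_change, 4)

# Defect Maintenance: low tread noticed on a hypothetical date,
# with the 3-week estimate from the tyre example.
noticed = date(2025, 1, 6)
defect_risk_date = noticed + timedelta(weeks=3)

print(pm_risk_date)      # 2025-03-15
print(defect_risk_date)  # 2025-01-27
print(scheduled_ahead_of_risk(date(2025, 1, 20), defect_risk_date))  # True
```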
These two dates are important because they are the basis for the 2 fundamental KPIs covered here, which are:
- Backlog
- Volume of Deferred Work
Whilst there are many other KPIs, understanding these 2 can give considerable insight into an organisation’s work management performance.
BACKLOG
The challenges around the definition of “Backlog” are a topic worthy of another article, so I won’t go into that in detail here. My goal is to explain the information that your organisation must capture in order to effectively understand work management performance and risk.
For the purpose of this section, let’s use the following definition:
Backlog = All work that has not been executed by its Risk Date.
Backlog then, using this definition, represents the amount of unassessed or unendorsed risk that exists within a business. The volume of this work can be measured by count of activity, giving an indication of how many risks exist in the business, or by count of hours associated with the work’s execution, giving an indication of the level of effort required to eliminate the risk. Either can be used effectively but there is obviously value in measuring both.
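As a rough illustration of the two measures, the sketch below counts backlog both by activity and by hours. The `WorkOrder` record, its field names and the sample data are all hypothetical, and we assume backlog is taken as a snapshot on a given day: work that is still open and whose Risk Date has already passed.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class WorkOrder:
    # Hypothetical record; a real work management system will differ.
    order_id: str
    risk_date: date
    estimated_hours: float
    completed_on: Optional[date] = None  # None while the work is still open


def in_backlog(order: WorkOrder, today: date) -> bool:
    """Open work whose Risk Date has already passed (snapshot assumption)."""
    return order.completed_on is None and order.risk_date < today


def backlog_volume(orders: List[WorkOrder], today: date) -> Tuple[int, float]:
    """Return backlog as (count of activities, total estimated hours)."""
    backlog = [o for o in orders if in_backlog(o, today)]
    return len(backlog), sum(o.estimated_hours for o in backlog)


orders = [
    WorkOrder("WO-1", date(2025, 1, 10), 8.0),  # open, Risk Date passed
    WorkOrder("WO-2", date(2025, 3, 1), 4.0),   # open, Risk Date still ahead
    WorkOrder("WO-3", date(2025, 1, 5), 2.5, completed_on=date(2025, 1, 4)),
]

count, hours = backlog_volume(orders, today=date(2025, 2, 1))
print(count, hours)  # 1 8.0
```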
The understanding of this has been diluted across many industries over the years, with many now thinking that backlog is simply an indication of inefficiency. Whilst partly true, viewing inefficiency as the only issue backlog represents has led to behaviours that are extremely damaging. How many times have you heard people say:
‘Work should be left in backlog to make it visible’?
It is true that leaving work in backlog does continue to highlight inefficiency but, more crucially, it means that businesses are operating with unmanaged and unknown risk. This, in turn, means that the very negative consequences which the maintenance is there to prevent may occur at any time. This is extremely detrimental to any business as it greatly increases uncertainty and reactivity.
To understand what risk exists within your organisation, it’s important to understand how backlog is treated, and what the mechanism is for managing the risk associated with that backlog. Using the method described above, in order to reduce the uncertainty as much as possible and move the risk from unknown to understood, all work in backlog should be reassessed to determine the consequence and the likelihood of that consequence occurring, and a revised Risk Date set. This means the business can once again see the risk that it is carrying and understand when the negative consequence is likely to occur.
The mechanism for carrying out this reassessment is typically referred to as a deferral, which leads to the next KPI…
VOLUME OF DEFERRED WORK
A deferral is the mechanism whereby a piece of work has its Risk Date updated by reassessing the consequence of the work not being complete and the likelihood of this consequence occurring once mitigation has been included.
The Volume of Deferred Work is the amount of work that has had its Risk Date updated and moved beyond its originally assigned Risk Date.
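This definition can be sketched in the same way as backlog. The record below is hypothetical: it assumes the system retains both the originally assigned Risk Date and the current one, which is what makes deferred work countable at all.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class WorkOrder:
    # Hypothetical record that keeps both the original and the current Risk Date.
    order_id: str
    original_risk_date: date
    current_risk_date: date
    estimated_hours: float


def is_deferred(order: WorkOrder) -> bool:
    """Deferred work has a Risk Date moved beyond the one originally assigned."""
    return order.current_risk_date > order.original_risk_date


def deferred_volume(orders: List[WorkOrder]) -> Tuple[int, float]:
    """Return deferred work as (count of activities, total estimated hours)."""
    deferred = [o for o in orders if is_deferred(o)]
    return len(deferred), sum(o.estimated_hours for o in deferred)


orders = [
    WorkOrder("WO-1", date(2025, 1, 10), date(2025, 4, 10), 8.0),  # deferred
    WorkOrder("WO-2", date(2025, 3, 1), date(2025, 3, 1), 4.0),    # unchanged
    WorkOrder("WO-3", date(2025, 2, 1), date(2025, 6, 1), 16.0),   # deferred
]

print(deferred_volume(orders))  # (2, 24.0)
```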
There are 2 main reasons for measuring this activity.
Firstly, it measures how much work could not be completed by the originally assigned Risk Date, which itself gives an indication of how well risk is being managed. Whilst not true in all cases, deferring a risk generally increases the probability of the negative consequence occurring ahead of its Risk Date, simply because of the additional time. It also gives an indication of how much risk is being mitigated or endorsed, as opposed to eliminated by executing the maintenance work.
Secondly, it gives an indication of execution efficiency and/or capability. This is the aspect that leaving things in backlog “makes visible”. By counting the volume of deferred work, a business is able to capture the vital information regarding execution efficiency and/or capability without the negative impact of carrying unknown risk.
Whilst a deferral mechanism is most commonly used for Safety Critical work in the Oil and Gas industry, it can and should also be used for Business Critical work. The control of safety critical backlog is one of the key mechanisms used to ensure that major accidents do not occur. Similarly, the control of business critical backlog should be a mechanism used to ensure that production efficiency is not impacted by unplanned losses.
Deferred work has developed a negative reputation because deferrals have been abused as a mechanism for moving work from backlog into forelog without the necessary risk assessment and review described above. This has been done with both good and bad intentions.
The bad intentions are those of a maintenance manager signing off, with a single deferral, on moving the dates of thousands of hours of backlog on the last working day of the year in order to meet their annual target.
The good intentions are those behind moving all work on specific systems or equipment out to a date when the equipment is likely to be taken out of service, or alternatively moving the date to an arbitrary one in the future and then bringing the work ‘back into the schedule’ once it is understood when it can be executed. It’s worth noting, however, that had the risk been assessed and understood, the former approach would be acceptable, as it would mean the risk is assessed or at least endorsed. It is the lack of review that is egregious.
Whilst the intentions in these examples are obviously very different, the negative consequence is essentially the same. The Risk Date of the work has been moved to a date picked for convenience rather than one based on risk. This means that the business becomes truly blind to the risk that it is carrying and the likelihood of that risk occurring.
Thankfully there are simple ways for leaders and organisations to prevent this abuse of the process.
- Ask how the deferral process within their organisation is implemented. It should be based on the principles above, and key people within the approval process should understand this. If neither of these things is true, then further investigation would be appropriate.
- Ensure there is an audit mechanism in place whereby there is an independent assessment of deferrals being carried out. This is best approached by a simple random sample of deferrals with a narrative provided on their quality.
- If a backlog reduces suddenly, ask for an explanation of how it was achieved. Whilst it can be tempting to simply celebrate the fact it has reduced, positively recognising this drop without the appropriate understanding can lead to the reinforcement of hugely detrimental behaviour.
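The random sample suggested for the audit can be drawn very simply. This is a sketch under stated assumptions: the deferral identifiers and the function name `sample_deferrals_for_audit` are hypothetical, and the sample size is whatever the auditor chooses. The narrative on quality still has to be written by a person reviewing each sampled deferral.

```python
import random
from typing import List, Optional, Sequence


def sample_deferrals_for_audit(
    deferral_ids: Sequence[str],
    sample_size: int,
    seed: Optional[int] = None,
) -> List[str]:
    """Draw a simple random sample of deferrals, without replacement.

    A fixed seed makes the selection reproducible, so an independent
    reviewer can confirm the sample was not hand-picked.
    """
    rng = random.Random(seed)
    ids = list(deferral_ids)
    k = min(sample_size, len(ids))  # never ask for more than exist
    return rng.sample(ids, k)


# Hypothetical deferral references DEF-001 .. DEF-050.
all_deferrals = [f"DEF-{i:03d}" for i in range(1, 51)]
to_review = sample_deferrals_for_audit(all_deferrals, sample_size=5, seed=2025)
print(to_review)
```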
Once the mechanisms above are understood, using the 2 KPIs together makes interpreting them at a high level for your organisation fairly straightforward.
A high level of backlog with a low level of deferred work means you have a high level of unknown risk and your deferral process is not being used sufficiently to manage this risk.
A high level of backlog with a high level of deferred work means you have a high level of unknown risk and your deferral process is being used to manage this but is unable to cope with the volume.
A low level of backlog and a low level of deferred work means that your organisation looks to be executing its work ahead of the defined Risk Date.
A low level of backlog and a high level of deferred work means your organisation is managing its risk using the deferral process as opposed to eliminating the risk by executing the work.
Whilst these are obviously simplified interpretations and most would benefit from a few more questions to support their conclusion, even at this simplified level they give passable rules of thumb.
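The four combinations can be written down as a small lookup. This is only a sketch of the rules of thumb above: what counts as a "high" level is not defined in this article and will differ between organisations, so the function simply takes that judgement as two yes/no inputs. The function name `interpret` is our own.

```python
def interpret(backlog_high: bool, deferred_high: bool) -> str:
    """Map the backlog / deferred work combination to its rule-of-thumb reading."""
    if backlog_high and not deferred_high:
        return ("High unknown risk; the deferral process is not being used "
                "sufficiently to manage it.")
    if backlog_high and deferred_high:
        return ("High unknown risk; the deferral process is being used "
                "but cannot cope with the volume.")
    if not backlog_high and not deferred_high:
        return "Work looks to be executed ahead of its Risk Date."
    # Remaining case: low backlog, high deferred work.
    return ("Risk is being managed through deferrals rather than "
            "eliminated by executing the work.")


print(interpret(backlog_high=False, deferred_high=True))
```

Each reading is a starting point for further questions rather than a conclusion in itself.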
APPLICABILITY IN THE FACE OF VARIANCE
As alluded to earlier in the article, there are, unfortunately, several interpretations of the term backlog across the industry. Should you work in an organisation that interprets backlog slightly differently from the definition used above, the insights can still be gleaned provided you ask the right questions to establish the concepts described.
These questions are:
- How is the Risk Date determined?
- What happens if work is not executed by this Risk Date?
- If the date is changed, how do I see how much work has had its date changed?
- If the date is not changed, how do I see what risk this represents and when that risk will occur?
Asking these simple questions and paying close attention to the answers should lead to some interesting insights.