

Optimal Maintenance Decisions Inc.

Asset Reliability

Living RCM

Slide 1

<table class="wikitable"><tr><td class="wikicell" >

You’ll notice from the title that I’m attempting to fuse the three dominant initiatives in maintenance: CBM (or PdM or CM), the CMMS, and the RCM thought process. The slide lists our objectives for this talk.
<p>
The CBM programs and techniques displayed in the exposition hall and described during the sessions are indeed impressive. Nevertheless, if one steps back from the glitter of technology, one might admit that ample data interpretation challenges and opportunities still lie ahead. Without, I hope, sounding pretentious, I’ll suggest ways, in the next 45 minutes, by which we can enhance our CBM effectiveness.
<p>
Secondly, I hope to convey to you the advantages of "living RCM".
<p>
Finally, we'll discuss how to endow your CBM programs with an inherent improvement process.
<p>
These are ambitious goals for 45 minutes. So let’s begin.

</td><td class="wikicell" ><span class="img"><img alt="" src="http://www.omdec.com/wikiimages/livingRcm/Slide1.jpg" border="0" /></span></td></tr></table>

Slide 2

All CBM relies heavily on software. Software, at our bidding, performs functions.

Secondly, but less obviously, each software product embodies its own culture or point of view. We, as users, need to embrace that viewpoint in order to benefit fully from the application.



Slide 3

CBM optimization software assists us in building CBM decision models. A decision model is a kind of measuring stick that we can apply to our monitored data. Using the model we interpret that data 'optimally'.

What do we mean by "optimal" in CBM? When we get CBM data, its indications are seldom "black and white". Torrents of data from proliferating sensors and process control systems tend to overwhelm.

Often our decisions are judgment calls. Judgment can span a spectrum of choices from extreme caution (acting too soon) to lethargy (waiting till it's too late). Making the best choice (given the current operating context) is what we mean by optimal.

I’ve heard the word optimal being used at this conference, for example in “optimal reliability”. That’s not clear. We should rather speak of an 'optimal decision process' that achieves a certain objective, for example maximum availability or a specified survival probability in a time frame. Or, in a "cost constrained" operation (operating below market capacity), perhaps the optimizing objective is minimal cost. Or, optimization may seek to achieve a specified combination of the above.

A model is a procedure used to interpret data in order to achieve a long run objective. It may be a set of rules enshrined in an expert system. Or it might be a statistical model, or perhaps just someone’s gut feel. Even that’s a model, though not one I would care to admit to my boss or to my company’s shareholders.

Secondly, through software, we deploy that model in an automated way. I subscribe to the school of thought that unless CBM can be mostly automated using rules or algorithms, it will return limited benefits. There is too much data, and there are too few human resources to pronounce upon all that data.

Thirdly, the whole CBM exercise is of little value unless we can measure its effects objectively, and improve it by tweaking our models. This third function is really what defines a maintenance and reliability engineer, in my view.



Slide 4

The viewpoint of CBM optimizing software is expressed in three premises.

Firstly, that information on what happened actually has value. I’m talking about historical data in the CMMS. How many of us have had the thought ourselves, or have heard the statement by technicians, planners, and consultants, that the information in the CMMS is being under-utilized? Some consultants even claim that it is useless. To the contrary, historical information in the CMMS is vitally important. The major failing of the CMMS industry is that it has not provided tools and methods for harvesting knowledge from that rich information source.

Secondly, if we accept Premise 1, we may ask how we may convert that information to practical knowledge. What is knowledge in maintenance? It is the ability to make the right decisions, given the current data.

Premise 3 says that we need to improve our (decision enabling) knowledge base continually.



Slide 5

Here is a question that all of us have asked at one time or another. What information do we really need to include on the work order for purposes of subsequent reliability analysis?

What is reliability analysis (RA)? How does it differ from reliability-centered maintenance analysis (RCMA)?

RA is the study of failure modes that have occurred. And RCMA is the study of the failure modes that 'can' reasonably occur. Both those analyses populate the same knowledge base. They complement each other. RA sanity checks and refines the assumptions and approximations of RCMA, while RCMA provides the structured framework for a consistent language and thought process.

Animation 5-1
I am asking you to consider adding two tiny pieces of information to your work orders that can make all the difference in the world.

The first is the RCM reference number. This is a number that refers to a record in the RCM knowledge base. It is a one-to-many relationship: one RCM record may be referenced by many work orders. This relationship has powerful implications for reliability improvement. Linking a significant work order to the RCM table means that the work order becomes an instance of an RCM knowledge record. And if that is true, then we can count the occurrences of significant item-function-failure-causes. Counting these instances (in various ways) is precisely the function of reliability analysis software and methods, examples of which are: Pareto, Jack-knife, Weibull, proportional hazard modelling, and many others.

I have visited many plant sites and have spoken with their maintenance and reliability analysts. All of them have purchased and attempted to use reliability analysis software. But they ran into a problem. They didn’t have the right data in the proper form to feed the otherwise very powerful software that they had purchased.

By adding these two pieces of information, we solve that problem.

What about that second suggested piece of information, "Event Type"?

There are, mainly, three event types: PF, FF, and S. Was this a functional failure (FF) that had consequences? Or was it a potential failure (PF) that we caught in time to prevent the most dire consequences of the failure mode? Or was this a suspension (S)? A suspension is the renewal or replacement of a component for any reason other than failure, either potential or functional. The planner or technician may have found it expedient to replace the part even though it was not in a failed state nor showing any signs of impending failure.
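
To make the idea concrete, here is a minimal Python sketch of a work order carrying those two pieces of information, and of counting instances per RCM record and event type. The class name, field names, and sample records are hypothetical; a real CMMS would have its own schema.

```python
from collections import Counter
from dataclasses import dataclass

# Event types named in the talk: functional failure, potential failure, suspension.
EVENT_TYPES = {"FF", "PF", "S"}


@dataclass(frozen=True)
class WorkOrder:
    """A work order with the two extra fields (field names are hypothetical)."""
    number: int
    rcm_ref: int      # reference to a record in the RCM knowledge base
    event_type: str   # "FF", "PF", or "S"

    def __post_init__(self):
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type!r}")


def count_instances(work_orders):
    """Count work orders per (RCM reference, event type) pair."""
    return Counter((wo.rcm_ref, wo.event_type) for wo in work_orders)


# Made-up work orders, for illustration only.
orders = [
    WorkOrder(1, 101, "PF"),
    WorkOrder(2, 101, "FF"),
    WorkOrder(3, 101, "PF"),
    WorkOrder(4, 205, "S"),
]
counts = count_instances(orders)
print(counts[(101, "PF")])  # 2
```

Because many work orders point at one RCM record, a simple counter keyed on the reference is already enough to start counting occurrences.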



Slide 6

Now those of you familiar with RCM are going to ask, how in the world is a technician or planner ever going to locate the appropriate RCM knowledge record from among, literally, tens of thousands in the knowledge base?

Animation 6-1
We search for it. How does modern man or woman search for information today from unimaginably huge knowledge bases? With a search engine. We enter a few key words. We might get 50 or 1000 hits. But the one we're looking for is usually close to the top of the list.
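
As a rough illustration of such a keyword search, the sketch below ranks knowledge records by how many query words they contain. It is a deliberately naive stand-in for a real search engine; the record texts and reference numbers are invented, and punctuation handling is ignored.

```python
def score(query, text):
    """Number of distinct query words that appear in the text (case-insensitive)."""
    words = set(text.lower().split())
    return sum(1 for term in set(query.lower().split()) if term in words)


def search(query, records, limit=10):
    """Return (rcm_ref, text) pairs with at least one hit, best match first."""
    ranked = sorted(records.items(),
                    key=lambda item: score(query, item[1]),
                    reverse=True)
    return [(ref, text) for ref, text in ranked if score(query, text) > 0][:limit]


# Hypothetical knowledge records keyed by RCM reference number.
records = {
    101: "bearing seizes due to lack of grease",
    102: "pump impeller worn by cavitation",
    103: "bearing overheats due to misalignment",
}
hits = search("bearing grease", records)
print([ref for ref, _ in hits])  # [101, 103]
```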

Animation 6-2

We can check by hitting the offered link, which brings up the associated knowledge document. The technician or planner reads the failure mode and the effects, and determines whether this knowledge document describes the current situation. It does. So we grab the RCM reference number and enter it onto our work order. We have established the link.

What if we find some errors or would like to add some clarifying details, say to the Effects field? We edit the knowledge record. But wait! Can we allow anyone to modify the knowledge base? Yes, we can, because the edits are being tracked, just as in Word. The new or modified knowledge record is automatically routed to a verifier (an RCM facilitator, if you will). A discussion could ensue. Eventually the knowledge base is updated with verified knowledge.

However, we don’t throw the edit tracked document away. It persists as a permanent audit trail of our evolving understanding of the failure behavior of our significant assets.

What if we can’t find a knowledge record in the RCM knowledge base that represents the current situation? We perform a mini-RCM analysis on the spot. What better time? The equipment has been opened. The situation (the Effects and Consequences) is fresh in the minds of all concerned.



Slide 7

Now that we’ve lined up our knowledge with the instances of that knowledge (the work orders of the CMMS), we can, without any additional bother, perform a variety of valuable reliability analyses.

We accomplish that via a bridge between the maintenance database and our reliability software applications. We call this bridge the "Events table". I call your attention to the fourth column of this table, the Event. Look carefully at the syntax of that event code. It begins with a letter, either E (for "ending") or B (for "beginning"). Then a number, which is none other than our RCM reference. And finally a suffix: FF, PF, or S.

Hence, in one succinct code we have captured precisely what happened. We see that a work order, say work order 9, expands into two events in the Events table. The ending of the life cycle of component 20890, and, the beginning of the life cycle of the renewed or replaced component. This is the form of data that reliability analysis software needs. The Events table was generated automatically from the work orders endowed with those two additional, simple, yet potent, pieces of information that we recommended in Slide 5.
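
The expansion of a work order into events can be sketched in a few lines of Python. This is only an illustration of the code syntax described above, not the actual tool. The function and field names are hypothetical, and I am assuming here that the ending event carries the FF, PF, or S suffix while the beginning event carries none.

```python
import re

# Assumed syntax: B or E, then the RCM reference number, then an optional suffix.
EVENT_CODE = re.compile(r"^(B|E)(\d+)(FF|PF|S)?$")


def expand_work_order(wo_number, date, rcm_ref, event_type):
    """Turn one work order into an ending event and a beginning event."""
    if event_type not in ("FF", "PF", "S"):
        raise ValueError(f"unknown event type: {event_type!r}")
    return [
        # End of the old component's life, and how it ended.
        {"work_order": wo_number, "date": date, "event": f"E{rcm_ref}{event_type}"},
        # Start of the renewed or replaced component's life (no suffix assumed).
        {"work_order": wo_number, "date": date, "event": f"B{rcm_ref}"},
    ]


def parse_event_code(code):
    """Split an event code back into its parts."""
    match = EVENT_CODE.match(code)
    if match is None:
        raise ValueError(f"malformed event code: {code!r}")
    kind, ref, suffix = match.groups()
    return {
        "kind": "beginning" if kind == "B" else "ending",
        "rcm_ref": int(ref),
        "event_type": suffix,  # None when no suffix is present
    }


# Made-up example: work order 9 closes a potential failure on RCM record 101.
events = expand_work_order(9, "2010-05-17", 101, "PF")
print([e["event"] for e in events])  # ['E101PF', 'B101']
```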

Animation 7-1
Here we see three more tables. The table on the bottom left, you will recognize as the RCM knowledge base. The other two tables represent a process known as data mapping. In the top table we create, by adding a record, a failure model. A failure model is a group of RCM records that we have mapped to it. The failure model defines the study that we would like to perform.

Let’s say, for example, we would like to study the behavior of all the grease bearings on a particular class of ship. We create the Failure Model “Grease bearings” and select all the RCM records whose instances we would like to include in our study. The results of that mapping are shown in the table at the right. Notice on the right, that we have mapped 11 RCM records to the failure model "Degaste general", or “general wear”.

Using this simple data mapping technique we can study any single or any group of failure modes.
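
In code, data mapping amounts to a lookup from a failure model name to a set of RCM references, followed by a filter on the events. The sketch below assumes a simple dictionary layout; the reference numbers, field names, and ages are invented for illustration.

```python
# Hypothetical mapping table: failure model name -> set of RCM reference numbers.
failure_models = {
    "Grease bearings": {101, 103, 117},
}

# Hypothetical events, each already linked to an RCM record.
all_events = [
    {"rcm_ref": 101, "event_type": "FF", "age_hours": 4200},
    {"rcm_ref": 102, "event_type": "PF", "age_hours": 3100},
    {"rcm_ref": 117, "event_type": "S", "age_hours": 5000},
    {"rcm_ref": 103, "event_type": "PF", "age_hours": 2800},
]


def events_for_model(model_name, models, events):
    """Select the events whose RCM reference is mapped to the failure model."""
    refs = models[model_name]
    return [event for event in events if event["rcm_ref"] in refs]


study = events_for_model("Grease bearings", failure_models, all_events)
print(len(study))  # 3
```

A failure model with a single mapped reference studies one failure mode; a larger set studies a group of them with no change to the code.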

Animation 7-2
With our failure models set up, we can begin to perform reliability analyses.

We can plot failure rate graphs (top right). And because, you will recall, we have discriminated between functional and potential failure, we can study the behavior of either. That study will tell us to what extent our PFs are pre-empting our FFs, which, of course, is the aim of CBM.

Animation 7-3

We determine whether we have quality problems of some type as evidenced by the infant mortality behavior of this failure (hazard) rate graph.
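
For reference, and assuming the graph comes from a two-parameter Weibull fit (these are Weibull analyses), the hazard rate has the standard form below, where $\beta$ is the shape parameter and $\eta$ the scale parameter. The symbols are the conventional ones, not taken from the slide.

```latex
h(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1},
\qquad t > 0,\ \beta > 0,\ \eta > 0

% \beta < 1 : hazard decreases with age (infant mortality)
% \beta = 1 : hazard is constant (random failures)
% \beta > 1 : hazard increases with age (wear-out)
```

A fitted shape parameter below 1 is what the infant mortality pattern looks like in numbers.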



Slide 8

We are not limited to Weibull analyses, such as those of the previous slide. We may perform Pareto analyses of all kinds. Here is one which shows the relative frequencies of five failure modes (or failure models) in a Cat 240 ton truck engine.

We can hit a tab and get the down-time comparisons, or the cost comparisons, or the availability comparisons. There is no limit to the breadth and depth of studies that we can perform, with little effort, provided that we have captured those two pieces of information mentioned earlier.
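
A Pareto ranking of this kind takes only a few lines once the events are linked to failure modes. The following sketch ranks either by frequency or by a summed measure such as downtime. The failure mode names, field names, and figures are made up.

```python
from collections import defaultdict


def pareto(events, measure=None):
    """Rank failure modes by event count, or by the sum of `measure` if given.

    Returns (failure_mode, value, cumulative_share) tuples, largest first.
    """
    totals = defaultdict(float)
    for event in events:
        totals[event["failure_mode"]] += 1 if measure is None else event[measure]
    grand_total = sum(totals.values())
    rows, running = [], 0.0
    for mode, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        running += value
        rows.append((mode, value, running / grand_total))
    return rows


# Invented events for illustration only.
events = [
    {"failure_mode": "injector", "downtime_h": 2},
    {"failure_mode": "injector", "downtime_h": 3},
    {"failure_mode": "injector", "downtime_h": 1},
    {"failure_mode": "turbo", "downtime_h": 20},
    {"failure_mode": "cooling", "downtime_h": 4},
    {"failure_mode": "cooling", "downtime_h": 4},
]

by_count = pareto(events)
by_downtime = pareto(events, measure="downtime_h")
print([row[0] for row in by_count])     # ['injector', 'cooling', 'turbo']
print([row[0] for row in by_downtime])  # ['turbo', 'cooling', 'injector']
```

Note how the ranking changes with the measure: the most frequent failure mode is not necessarily the one that costs the most downtime.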



Slide 9

A Jack-knife analysis (developed by Peter Knights) is an extended Pareto analysis. It shows two dimensions of the analysis simultaneously. Once again we are focusing on the Cat 240 T truck engine.

The vertical axis measures the actual gravity of a failure mode. In this case it is measuring downtime. The horizontal axis measures frequency.

The graph has been sectioned into four quadrants. The dots represent the failure modes. Those which fall in the upper right quadrant require our most urgent managerial attention. They are both acute and chronic. This graphical representation tracks the results of our policy changes. As we modify our policies, we should see the failure mode dots migrating eventually towards the origin. Hence we have a clear, quantitative measure of our progress towards ultimate reliability.
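
The quadrant logic can be sketched as follows. I am assuming that the quadrant boundaries are set at the average values (mean number of failures per failure mode, and mean downtime per failure), which is one common choice; the thresholds on the slide may be set differently. The failure mode names and figures are invented.

```python
def jackknife(stats):
    """Classify failure modes into Jack-knife quadrants.

    stats maps failure mode -> (number_of_failures, total_downtime_hours).
    Thresholds are assumed to be the averages across all modes.
    """
    n_total = sum(n for n, _ in stats.values())
    d_total = sum(d for _, d in stats.values())
    freq_limit = n_total / len(stats)   # mean failures per failure mode
    mttr_limit = d_total / n_total      # mean downtime per failure

    result = {}
    for mode, (n, downtime) in stats.items():
        acute = downtime / n > mttr_limit   # long downtime per event
        chronic = n > freq_limit            # happens often
        if acute and chronic:
            result[mode] = "acute and chronic"
        elif acute:
            result[mode] = "acute"
        elif chronic:
            result[mode] = "chronic"
        else:
            result[mode] = "neither"
    return result


# Invented figures: (failures, total downtime in hours).
stats = {
    "A": (10, 100),
    "B": (2, 60),
    "C": (12, 12),
    "D": (1, 2),
}
quadrants = jackknife(stats)
print(quadrants["A"])  # acute and chronic
```

Re-running the classification after a policy change shows which failure modes have moved out of the upper right quadrant.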



Slide 10

Here we have a form of reliability analysis that extends Weibull analysis to include relevant factors beyond simply the age factor. It is based on what is known as a “proportional hazard model”. While Weibull analysis models the relationship between survival probability and working age, proportional hazard modeling adds other dimensions to the analysis – namely condition monitoring data.

The red and green graph is a visual representation of a decision model. The vertical axis represents the weighted sum of all those variables that have been found (by the RA) to be significant to the particular failure mode (or failure model) being studied.

The decision model answers the question, posed thousands of times by maintenance managers: "How do we set our CBM limits?"
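
For those who like to see the formula, a proportional hazard model with a Weibull baseline is commonly written as below. I am assuming that form here; the symbols are the conventional ones rather than anything shown on the slide. $Z_i(t)$ are the significant condition monitoring variables and $\gamma_i$ their weights.

```latex
h\bigl(t, Z(t)\bigr)
  = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}
    \exp\!\left(\sum_{i=1}^{m} \gamma_i Z_i(t)\right)
```

The weighted sum $\sum_i \gamma_i Z_i(t)$ is the quantity plotted on the vertical axis of the decision graph. With all $\gamma_i = 0$ the model reduces to the plain Weibull hazard, in which age is the only factor.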



Slide 11

Finally, recalling our objective of slide 1, we want to measure the performance of our CBM program so that we can continuously improve it.

Here is that Weibull hazard rate graph again. We notice that the Total failures curve and the Potential failures curve are very close to one another. The difference between the two is the Functional failures curve. It is low and flat, which is the signature of a well-maintained component.

Animation 11-2
How do we measure how 'predictive' our predictive maintenance program is? Here is one way. This is a histogram of 670 CBM inspections. Each inspection reported a remaining useful life estimate (RULE). The difference between the estimate and the actual failure time (normalized to actual failure time) is plotted on the vertical bar chart. We see that 412 CBM estimates were within 10%. 176 were as much as 20% off. And 20 out of the 670 were as much as 80% off. But the point is that we have a way to determine how well our algorithm for remaining useful life is performing.
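
A histogram like this one can be produced with a short calculation. The sketch below assumes each inspection yields a pair (estimated failure time, actual failure time) and bins the absolute error, normalized to the actual failure time, in steps of 10%. The function name and the sample pairs are invented; they are not the 670 inspections on the slide.

```python
import math
from collections import Counter


def error_histogram(pairs, bin_width=0.10):
    """Bin |estimated - actual| / actual into bins of width `bin_width`.

    Bin k covers relative errors from k*bin_width up to (k+1)*bin_width.
    """
    counts = Counter()
    for estimated, actual in pairs:
        relative_error = abs(estimated - actual) / actual
        counts[math.floor(relative_error / bin_width)] += 1
    return dict(sorted(counts.items()))


# Invented (estimated, actual) failure times, in operating hours.
pairs = [(105, 100), (88, 100), (135, 100), (98, 100)]
histogram = error_histogram(pairs)
print(histogram)  # {0: 2, 1: 1, 3: 1}
```

The share of inspections that fall in bin 0 is one simple measure of how predictive the program is.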

On the top right we have our CBM key performance indicators. They tell us, in bottom line financial/managerial terms, how well or poorly the current CBM policy meets our objectives.



Slide 12

The slide illustrates the simple workflow of creating or updating a knowledge document. All modifications are edit tracked (time/date/author stamped as in MS Word) and persist as a permanent audit trail of the evolving knowledge base.


Slide 13

The benefits of the rational use of information in maintenance begin to flow almost immediately, once RCM thinking is used when closing work orders. Every CMMS, EAM, or paper based work order system can be easily adapted to record the RCM knowledge reference and the Event Type.




Created by: Murray Wiseman. Last Modification: Monday 17 of May, 2010 12:11:18 MDT by mercury.
