Interview with Murray Wiseman

by John Little, Consultant, Dynamic Maintenance Safety and Risk Assessment, jelittle@oricom.ca

Q: Murray, I want a better feel for the ties between a CMMS, RCM and plant/process maintenance and related safety. I am, however, very curious as to the actual methods and practices used to assess failure modes and their consequences. Is this a team-based exercise using vendors, design engineers, process engineers, IT (DCS) experts, process control specialists, safety specialists, users, operators, maintenance supervisors-managers-technicians, etc.?

A: Yes, it's a team activity. The specialists you mention are called in as required. The core team includes experienced maintenance and operating persons, first-line supervisors, and a trained RCM facilitator who is selected from within the organization - not a consultant. The facilitator's main responsibility is to make sure that each RCM analysis benefits from all available knowledge – especially that which is locked in the memories of experienced personnel. Safety and environmental considerations come into play throughout the entire process. In the functional analysis, the team thoroughly exposes the hidden functions of protective devices as well as the structural integrity, containment, and control functions. The effects analysis describes the hypothetical scenario of events that may be touched off by, and lead up to, each failure cause. In the consequences analysis, hidden, safety, and environmental consequences are considered before operational and non-operational consequences. In the task analysis, safety procedures are explicitly stipulated - although their development is outside the scope of RCM.
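
As an illustration only, the precedence described in the consequences analysis (hidden, safety, and environmental consequences considered before operational and non-operational ones) can be sketched in Python. The class names, field names, and the relative order among the first three categories are assumptions made for this sketch; they are not taken from the interview or from any RCM software.

```python
from dataclasses import dataclass
from enum import IntEnum


class Consequence(IntEnum):
    """Consequence categories; a lower value is considered earlier.

    The order among HIDDEN, SAFETY and ENVIRONMENTAL is an assumption.
    The interview only states that these three come before the
    operational and non-operational categories.
    """
    HIDDEN = 1
    SAFETY = 2
    ENVIRONMENTAL = 3
    OPERATIONAL = 4
    NON_OPERATIONAL = 5


@dataclass
class FailureMode:
    """Hypothetical record of the team's answers for one failure mode."""
    description: str
    is_hidden: bool = False
    affects_safety: bool = False
    affects_environment: bool = False
    affects_operations: bool = False


def classify(fm: FailureMode) -> Consequence:
    """Return the first consequence category that applies, in precedence order."""
    if fm.is_hidden:
        return Consequence.HIDDEN
    if fm.affects_safety:
        return Consequence.SAFETY
    if fm.affects_environment:
        return Consequence.ENVIRONMENTAL
    if fm.affects_operations:
        return Consequence.OPERATIONAL
    return Consequence.NON_OPERATIONAL


if __name__ == "__main__":
    # A failure that both endangers people and stops production is
    # treated as a safety consequence, because safety is checked first.
    fm = FailureMode("Relief valve seizes shut",
                     affects_safety=True, affects_operations=True)
    print(classify(fm).name)
```

The point of the ordering is that a failure mode with several kinds of consequence is handled under the most serious category that applies.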

Q: Do they look at a piece of equipment/process and break it down into operating tasks, maintenance diagnostic tasks, maintenance tasks, etc. to spot the task related hazards and risks to the public, plant, personnel, and maintenance workers?
A: Yes; however, the safety aspects of the maintenance tasks, as well as the detailed specification of the tasks themselves, are analyzed outside of RCM.

Q: Do they just look at 'Critical equipment' and 'critical components' on a 'What-if they fail' basis?
A: Significant items - RCM targets "items" whose failures have hidden, safety, environmental, or serious economic consequences.

Q: Regarding CMMS software, the feedback I get from most maintenance experts/consultants is that each potential application must be assessed individually to determine what or which commercial CMMS software package best fits the client's needs. What are your thoughts on this?
A: To a minor extent, yes. But management consultants have not yet seized upon a very important required function of a CMMS - to capture and return knowledge for the benefit of improving reliability, safety, and environmental integrity at lowest cost.

Q: In particular, is there an industry standard or methodology available to assess the client's needs and identify/match the best commercial CMMS software package to meet those needs?
A: Yes. I would say RCM is that standard. RCM identifies the objectives of maintenance for each significant asset. It will soon become very clear to an organization that has embarked upon RCM what systems it requires to manage the maintenance plan. In my opinion all CMMSs are fairly similar; the differences among them are not earth-shattering.

Q: Are you unbiased in your views on this or do you represent certain (any) software vendors?
A: My bias is that I am partial to the RCM and EXAKT philosophies that say that all recorded maintenance information must target specific analysis objectives - reliability, availability, maintainability, safety, environmental integrity at lowest cost.

Q: Doesn't RCM, as developed by Nowlan and Heap and later John Moubray, propose that all possible failure modes be analyzed, regardless of their relative severity?
A: This critique is often made, but it could not be farther from the truth. Neither N & H nor Moubray ever made this statement. Far from analyzing "all" failure modes, Moubray uses the expression "all reasonably likely" failure modes. He goes on to elaborate a process for determining those that are "reasonably likely" given the asset's operating context. The more severe the failure consequences, the more failure modes we consider (as being reasonably likely). In addition to a well-defined methodology for determining how many failure modes to consider, Moubray also provides the best advice I have seen on determining the number of times to ask "why" in order to arrive at the appropriate depth of causality of each failure mode (see his section on "Causation" on page 65 of his book).

Q: What about well established analysis techniques such as root cause failure analysis, fault tree analysis?
A: RCM does not preclude the use of FTA, RCFA, Pareto charts, scatter plots, and/or any other tools and data sources that bear upon the discovery of the asset's failure behavior and consequences. FTA can be of great value in the failure modes analysis for complex systems. And RCFA is definitely a sub-process of failure modes analysis. Nevertheless, one often finds that many valuable "secrets" can be unlocked quickly and efficiently, simply by asking the RCM questions of experienced operators and technicians within a facilitated and well-structured RCM forum. The call as to which additional tools to bring into the analysis is usually made by the RCM facilitator, who has his eye both on the clock and on the necessity to guard against superficial analysis.

Q: Has Moubray captured all of Nowlan and Heap's methodologies?
A: I'm glad you asked that. The RCM methodology discovered by N & H rests on three pillars: 1. Initial information gathering, 2. A decision process, and 3. Continuous analysis or "age exploration". The third leg of RCM is the continuous improvement cycle. Although Moubray discusses "living RCM" in his book, he does not emphasize its importance nor elaborate a process, other than a "review" to update the initial analysis every 9 months. N & H, on the other hand, were aware of and recognized, even in 1978, the huge future potential of relational databases, and made it a point to describe their role in the continuous enrichment of the RCM knowledge base. I believe that the CMMS has been undervalued and underused in this regard.

Q: Why not develop safe work procedures simultaneously with the failure mode and consequence analysis activities in order to reduce time lags, duplication of effort and key information transfer errors/loss to the operators, maintenance technicians and supervisors, process, safety and other specialists who I presume are developing these safe work procedures?
A: The RCM process tends to follow the sequence: 1) functional analysis, 2) failure analysis, 3) cause analysis, 4) effects analysis, 5) consequence analysis, and 6) task specification. Please note that all safety issues associated with functional failure of the asset, including those caused by human error, are thoroughly analyzed in the FMEA and consequence portions of the RCM analysis. It is only during task specification, however, that safety procedures associated with maintenance tasks are invoked. Why don't we talk about maintenance task safety during questions 1 to 5? Because we haven't yet determined the tasks required to acceptably mitigate the consequences of the failure. Why not design the safety procedures during the task specification? Because it would take too long and bog down the RCMA, which should proceed at a rate of about six failure modes per hour. In any event, it is far more efficient (from a human process point of view) to consider issues such as detailed design modifications or detailed safety procedures in another meeting designed specifically for that purpose and having the necessary safety and design experts on hand. So the general answer to your question is that it is simply more expedient to get one job done - i.e., RCM, where tasks, safety procedures, and design changes are specified - and then get on with other necessary follow-on jobs (i.e., details of redesign, detailed maintenance safety procedures, detailed work planning procedures, scheduling, resourcing, and so on).
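
To give a rough feel for what the quoted pace of about six failure modes per hour implies, here is a small Python sketch. The function names, the example count of 120 failure modes, and the three-hour session length are all hypothetical; only the rate comes from the answer above.

```python
import math

# Pace quoted in the interview: about six failure modes per hour.
FAILURE_MODES_PER_HOUR = 6.0


def estimated_meeting_hours(n_failure_modes: int,
                            rate: float = FAILURE_MODES_PER_HOUR) -> float:
    """Estimate facilitated meeting time, in hours, for an analysis."""
    if n_failure_modes < 0 or rate <= 0:
        raise ValueError("need a non-negative count and a positive rate")
    return n_failure_modes / rate


def estimated_sessions(n_failure_modes: int,
                       session_hours: float = 3.0,  # assumed session length
                       rate: float = FAILURE_MODES_PER_HOUR) -> int:
    """Round the estimate up to whole sessions of a fixed length."""
    if session_hours <= 0:
        raise ValueError("session length must be positive")
    return math.ceil(estimated_meeting_hours(n_failure_modes, rate) / session_hours)


if __name__ == "__main__":
    # Hypothetical asset with 120 reasonably likely failure modes.
    print(estimated_meeting_hours(120))  # 20.0 hours of meeting time
    print(estimated_sessions(120))       # 7 sessions of three hours each
```

Anything that slows the meeting, such as drafting detailed safety procedures on the spot, lowers the rate and stretches these figures, which is the concern raised in the answer.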

Q: Dynamic Maintenance Safety is based on the premise that line employees cannot remember every single safety rule, good practice, standard operating procedure, etc., that they learned ages ago in orientation and training. The purpose of Task Based Hazard Analysis is to identify only those hazards that are relevant to each step in the task to be performed. It ascertains that no 'surprise hazards' go undetected at the time a task is performed and that the procedures to follow in performing the task are current, relevant, safe and efficient (a last-minute check before "Takeoff"!). Should not this process be performed within the RCM team meetings?
A: RCM defines what has to be done to preserve physical asset function. Physical assets have, among others, protective, control, containment, and structural integrity functions that need to be maintained. RCM discovers what those functions are, how they can fail, what happens when they fail, and how it matters. RCM then guides the selection of the appropriate task, design change, or procedural change to manage the consequences of the failure. With the growth of mechanization and automation, most safety, environment, product quality, and asset reliability issues are associated with the failure of some function of some electro-mechanical physical asset. Therefore, properly analyzing and maintaining function in the asset's current operating context goes a long way to preserving life and the environment. RCM, however, does not preclude Dynamic Maintenance Safety, but complements and assists it by having distilled the complex functional requirements of the asset into highly readable form. Both processes will gain by the successful application of the other.

One further point in this regard. The initial RCM analysis will not, by any means, be perfect. It will not uncover every likely failure mode, nor will it anticipate all important failure effects of each failure mode that it does initially expose. A "living RCM" process must repeatedly monitor and analyze maintenance results, and question and test the assumptions made initially (with the information available at the time). I would consider Dynamic Maintenance Safety a valuable and necessary part of the living RCM process.