PART 1
The Alarm Management Problem
CHAPTER 1
Meet Alarm Management
If you need a new machine and don’t buy it, you pay for it anyway but never get to use it.
- Henry Ford
An alarm is an announcement to the operator initiated by a process variable (or measurement) passing a defined limit as it approaches an undesirable or unsafe value. The announcement includes audible sounds, visual indications (e.g., flashing lights and text, background or text color changes, and other graphic or pictorial changes), and messages. The announced problem requires operator action. An alarm is a construction by which an aspect of manufacturing operation is identified and configured in a binary way to be either “in alarm” or “cleared” (i.e., not in alarm). The condition of in alarm is passed to an operator via intrusive sounds and notices placed on video display units or other devices to gain attention. The operator can manage these sounds and notices only via specific “silence the alarm” or “acknowledge the alarm” actions using the existing, planned infrastructure of the alarm platform. Usually, this alarm platform is an integral part of the process control system (PCS) infrastructure.
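The binary nature of an alarm described above can be illustrated with a minimal Python sketch. It is not how any particular PCS implements alarms: the class name `HighAlarm`, the tag `TI-101`, and the limit of 150.0 are hypothetical, and real alarm platforms add features such as deadbands and delays that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class HighAlarm:
    """Hypothetical high-limit alarm on one process variable (illustration only)."""
    tag: str
    limit: float
    in_alarm: bool = False      # binary state: "in alarm" or "cleared"
    acknowledged: bool = True   # nothing to acknowledge until the first activation

    def update(self, value: float) -> bool:
        """Evaluate a new measurement; return True only when the alarm just activated."""
        was_in_alarm = self.in_alarm
        self.in_alarm = value > self.limit
        activated = self.in_alarm and not was_in_alarm
        if activated:
            self.acknowledged = False  # a new activation demands operator attention
        return activated

    def acknowledge(self) -> None:
        """Operator acknowledges the alarm; the process condition itself is unchanged."""
        self.acknowledged = True


alarm = HighAlarm(tag="TI-101", limit=150.0)
# The measurement rises past the limit, stays high, then returns to normal.
events = [alarm.update(v) for v in (140.0, 152.0, 155.0, 149.0)]
```

Only the crossing from 140.0 to 152.0 counts as an activation. Acknowledging changes what the operator sees and hears, not whether the variable is still beyond its limit.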
The PCS alarm system is a vital and productive tool for managing industrial process control plants. Through several unique cooperative endeavors, industry has identified a best practice for alarm system design. This design utilizes configuration changes, alarm reprioritization and balancing, alarm reductions, graphics modifications, online filtering, and decision support aids. Alarms perform the vital function of operational integrity monitoring. Properly designed alarms will notify the operator of abnormal situations with enough time to successfully manage them.
Figure 1.0.1. Alarms are an intrusive notification to the operator
1.1 KEY CONCEPTS
Alarms are for the operator | The alarm system must be off-limits for all plant uses that do not directly require the operator to actively process the situation or information. |
All of the objectives of alarm improvement are just good engineering | There is nothing that alarm improvement asks of plant operators that is over and above what constitutes effective plant design and operation. Somehow, bits and pieces are overlooked or shortcuts are taken. Poor or inadequate alarm performance is just the way we find out about these things. The technology of alarm improvement provides a focused, compact way of getting that job done. |
Alarm redesign is based on important fundamental concepts | Alarm redesign is based on four powerful concepts: only notify important conditions, notify in time, respond, and provide guidance. |
Initial alarm system performance matters little | Few, if any, unimproved alarm systems have been designed to meet the fundamental concepts; therefore, reducing alarm annoyance and activation rates only treats symptoms of a nonperforming design. |
Improving alarms alone will not provide enough benefit | Good alarm systems only work when the entire plant infrastructure supports good operation. |
1.2 ALARM PERFORMANCE PROBLEMS
Ask almost any operator if the alarm system is working for or against good process operation. The response you are likely to get is surprise, not because the question is unclear but because it took you so long to ask. Ask control engineers or process engineers that same question and, more and more often, they will say that the alarm system needs fixing. Ask experts in industrial accident investigation and they will tell you that inadequate alarm system performance contributes to a significant number of industrial accidents and major calamities. They will quickly suggest that you make plans to evaluate your PCS alarm system.
Symptoms
Right off the bat, there are clear indicators of alarm problems. If your site has any of these symptoms, there is cause for concern. If you can observe three or more of them, there is much serious work to be done.
• Alarm activations occur without need for operator action.
• There is no plantwide philosophy for the alarm system.
• There are no clear guidelines for when to add an alarm and how to do it.
• There are no controls for removing existing alarms.
• Operating procedures are not tied to alarm activations.
• When alarms activate, the operator is not always sure what to do about them.
• Seemingly routine operations produce a large number of alarm activations that serve no useful purpose.
• Minor operating upsets produce a significant number of alarm activations.
• Significant operating upsets produce an unmanageable number of alarm activations.
• Some alarms remain active for long periods of time.
• When nothing is wrong, there are active alarms.
Evidence
We look for evidence of alarm problems in four places. Two are quite objective; two are subtle.
• Number of alarms configured in the PCS database. How many tags are alarmed? How many alarms are configured for each alarmed tag? How close are the alarm limits to actual process limits? How closely is priority matched to actual operational risk?
• Number of alarm activations. What is the hour-by-hour average occurrence of alarms? How often do alarm floods occur? How many alarms occur during a flood? What is the distribution of alarm priorities during floods?
• Operators’ ability to gain insight and guidance from the alarm system before, during, and after an upset. If alarms are not active, how sure are the operators that the process is normal? When alarms activate, do they help the operator diagnose and remedy problems? Does the alarm system itself (through excessive activations, lack of proper activation, or meaningless active alarms) interfere with or delay proper production management?
• Operators’ ability to determine how the process is actually performing from other tools and PCS capabilities. What is in place that permits operators to view and understand how the process is actually performing? If there were no alarm system at all, how easily could the operator determine how the process was performing and how close it might be to an abnormal condition?
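The activation questions in the second bullet above lend themselves to simple calculation from an alarm log. The following Python sketch shows one possible approach under stated assumptions: the log is just a list of activation timestamps, the function names are hypothetical, and the flood test (more than 10 activations in a fixed 10-minute window) is an assumed rule of thumb, not a definition given in this chapter. The sample data is invented for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_counts(timestamps):
    """Count alarm activations in each clock hour."""
    return Counter(ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps)


def average_per_hour(timestamps, start, end):
    """Average activations per hour over the review period [start, end)."""
    hours = (end - start) / timedelta(hours=1)
    in_period = sum(1 for ts in timestamps if start <= ts < end)
    return in_period / hours


def flood_windows(timestamps, window_minutes=10, threshold=10):
    """Return the start of each fixed window holding more than `threshold` activations.

    ASSUMPTIONS: the threshold and window length are illustrative defaults, and
    window_minutes must divide 60 evenly so windows align with the clock hour.
    """
    buckets = Counter()
    for ts in timestamps:
        floored_minute = ts.minute - ts.minute % window_minutes
        buckets[ts.replace(minute=floored_minute, second=0, microsecond=0)] += 1
    return sorted(start for start, count in buckets.items() if count > threshold)


# Invented sample log: six routine activations, then a burst of twelve.
day = datetime(2024, 1, 1, 8, 0)
routine = [day + timedelta(minutes=20 * i) for i in range(6)]
burst = [day + timedelta(hours=2, seconds=30 * i) for i in range(12)]
log = routine + burst

per_hour = hourly_counts(log)
average = average_per_hour(log, day, day + timedelta(hours=3))
floods = flood_windows(log)
```

In this sample the average of six activations per hour hides the burst in the third hour, which is why the hourly breakdown and the flood check are worth computing alongside the average.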
1.3 REASONS FOR ALARM IMPROVEMENT
No alarm system should be asked to overcome the intricacies and power of the uncontrolled effects of nature or the failed constructs of man. Pandora’s box cannot be closed. Humpty Dumpty cannot be put back together. Yet the future is not grim. History is not written solely for the purpose of conveying the worst from our past. Alarm management is thus charged with aiding and abetting our best efforts for accommodating the worst and setting reasonable courses for recovery. Within these pages, you will find the concepts, ideas, and practical approaches to bring what is possible to you. You are encouraged to recognize that your best will neither prevent nor minimize all the effects of misdirection. You are empowered to believe that your best efforts will yield nourishing fruit. As Larry O’Brien and Dave Woll say in Alarm Management Strategies,
Alarm management is one of the most undervalued and underutilized aspects of process automation today. In most cases, alarm systems do not receive the attention and resources that are warranted. This is understandable, because alarming appears to be a deceptively simple activity. Many plants still use the alarm management philosophy developed by the engineering firm when the plant was built.
As alarm systems become less effective, they diminish the effectiveness of all automation.1
How Alarms Fit into Process Operating Situation
Safety shutdown systems are designed to close down affected plant operations in the unfortunate situation where that operation is too close to an unsafe situation, an environmentally challenging condition, or a mode of operation that threatens the financial integrity of the plant. Once a shutdown occurs, the plant operation is significantly curtailed. This usually results in a loss of production, some equipment damage, and a considerable degree of internal investigation. Prior to restart, there might be repairs, changes to equipment and procedures, and of course lots of administrative work.
Most plants would choose to avoid a shutdown if it could be done without risk. Abnormal situations rarely present themselves without some warning. But those warning messages and signs are not always picked up by the operator. They are often subtle. Sometimes they are downright abstract or confusing. Rarely are they front and center enough to be found in time and remedied. However, in a well-designed a...