June 2011 Vol. 238 No. 6
Features
A Fresh Approach To Control Room Management And Related Best Practices
This article summarizes a SCADA implementer’s perspective regarding the intent of the Pipeline Hazardous Materials Safety Administration’s (PHMSA) Control Room Management (CRM) rule. In addition, it intends to provide a fresh approach to CRM, describing why companies should use the CRM process to go beyond compliance requirements and implement operating best practices that would significantly enhance operations reliability and pipeline safety.
Control Room Regulations Background
What is the CRM rule? More importantly, what is PHMSA’s intent with this rule? One of PHMSA’s roles is to study pipeline incidents and make policy recommendations for improving pipeline safety.
PHMSA has identified “people” as a critical element in pipeline safety. People are often involved in pipeline events – preventing them, sometimes causing them, sometimes worsening them and always striving to mitigate their adverse effects.
In investigating incidents, PHMSA has found that a pipeline controller (controller as used herein is defined as the person seated in the chair facing the control system) may be qualified but is not always successful in managing abnormal situations or events. In fact, the controller’s ability to manage abnormal situations may be influenced by ineffective procedures, fatigue or even limitations in the SCADA system itself. To provide a balance between systems, implementation and procedures that help controllers be more successful, PHMSA offers the strategy “Prevention through People.”
There are two parts to the rule: 1) Part 192 (gas) and 2) Part 195 (liquids). The rule passed in December 2009 and became effective Feb. 1, 2010. To satisfy this rule, owners and operators with pipeline systems managed by controllers using a SCADA system must have a compliance plan completed by Aug. 1, 2011 and implemented by Feb. 1, 2013. However, a newly proposed rule would accelerate the implementation date to Aug. 1, 2011 for most items with a full implementation by Aug. 1, 2012. At the time this article was written, the accelerated rule was still under consideration.
A Performance-based Rule
The CRM rule is primarily a performance-based standard. In general, the rule identifies what is required but not how to meet the requirement. To comply, pipeline operators must: 1) provide effective operating and maintenance procedures, including specific requirements for training; 2) match the control room environment and equipment (including software, displays, alarm processing, furniture, lighting, noise, etc.) to human capability; and 3) provide controllers with warnings and guidance when abnormal operations occur.
This last requirement presents one of the greatest challenges to pipeline operators – demanding solutions and systems that differ significantly from anything they have in their control rooms today.
To meet the performance-based requirement of the rule, the operator should create written policies and procedures around control room operations. These written procedures must define exactly what controllers do in normal, abnormal and emergency operating conditions. The rule also requires controllers to have a formal and auditable mechanism for control hand-off during shift changes. Shifts should be organized to deal with fatigue, and both controllers and the people who work with them require training about fatigue and how it is mitigated.
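As an illustration only, a formal and auditable hand-off could be recorded along the following lines. This is a minimal sketch, not a method prescribed by the rule or by any SCADA product: the `HandoverRecord` fields, the `HandoverLog` class and the hash-chaining approach are all assumptions chosen for the example.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class HandoverRecord:
    """One shift hand-off (all field names are hypothetical)."""
    console: str
    outgoing: str
    incoming: str
    time_utc: str        # ISO 8601 timestamp of the hand-off
    open_items: tuple    # abnormal conditions or work in progress
    acknowledged: bool   # incoming controller accepted responsibility


class HandoverLog:
    """Append-only log. Each entry's hash covers the previous hash,
    so a later edit to any earlier record is detectable."""

    def __init__(self):
        self.entries = []  # list of (record, digest) pairs

    @staticmethod
    def _digest(prev, record):
        payload = prev + json.dumps(asdict(record), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, record):
        # Refuse a hand-off the incoming controller has not accepted.
        if not record.acknowledged:
            raise ValueError("incoming controller must acknowledge the hand-off")
        prev = self.entries[-1][1] if self.entries else ""
        digest = self._digest(prev, record)
        self.entries.append((record, digest))
        return digest

    def verify(self):
        # Recompute the chain from the start and compare stored digests.
        prev = ""
        for record, digest in self.entries:
            if self._digest(prev, record) != digest:
                return False
            prev = digest
        return True


log = HandoverLog()
log.append(HandoverRecord(
    console="Console A",
    outgoing="controller_1",
    incoming="controller_2",
    time_utc="2011-06-01T06:00:00Z",
    open_items=("Station 4 pump out of service",),
    acknowledged=True,
))
print(log.verify())
```

The point of the chain is simply that an auditor can confirm the log has not been altered after the fact. A real implementation would also need secure storage and controller authentication, which are outside this sketch.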
Management Of Change
To achieve compliant Control Room Management, pipeline operators should establish and enforce effective change management. A change management plan ensures that all workers are invested in – and continually work toward improving – the success of systems and procedures. Operators must study all incidents, understand what did and did not work and then proactively bring “lessons learned” back into the control room plan.
EnerSys maintains there are three essential aspects to implementing this rule from a systems and tools standpoint: 1) the first relates specifically to software and tools that support the process, particularly the way a SCADA system delivers information to controllers; 2) the second focuses on the tools required to implement the procedures, the handover and all change management; and 3) finally, there are the human factors issues related to lighting, air circulation, traffic and distractions – all part of the physical control room environment.
Guidelines Vs. Guidance
API has published a number of recommended practices, including API 1165 (Displays), API 1167 (Alarm Management) and API 1168 (Control Room); API 1167 was published only recently. These practices provide high-level guidance but are not prescriptive. For example, API, AGA and others all have fatigue management plan guidelines tied to basic science, but they are inconsistent in their recommendations.
PHMSA has stated publicly that it does not want to prescribe how pipelines operate under the CRM rule as each pipeline operator can best make its own decisions. Further, as a SCADA implementer, EnerSys appreciates the fact that procedures often don’t provide specific instructions on how to build and deliver control systems that support controllers. Rather, they offer guidance. Therefore, considerable invention is often required.
Having experienced a number of other regulatory changes over the years, EnerSys began seeking alternatives. Specifically, we looked for experts who knew the rationale and science behind the standards to help us craft our solutions. Our approach was to form relationships with key outside experts and bring them into our process. One such expert is Doug Rothenberg, a Ph.D. engineer who sits on the API 1167 as well as the ISA-18.2 committees on alarm management. For more than 20 years, he has worked in alarm management and incident investigation for the power, petrochemical, refining and process industries. His textbook on alarm management includes detailed methodology on how to implement an effective alarm management program.
Another key expert is Ian Nimmo, who sits on similar committees related to control room design and operator displays. He is co-author of a textbook on High-Performance Human Machine Interface and has worked on human factors issues in various industries.
Our approach incorporates the best practices of Rothenberg and Nimmo – including their textbooks and methodologies – and adapts them to the implementation required in the pipeline industry. Combining the pipeline operations experience of EnerSys with the CRM experience of Rothenberg and Nimmo creates a unique understanding of not only the requirements, but also the “why” behind the requirements.
Many pipeline operators will likely spend substantial time and effort reworking their control rooms and systems to comply with the rule. EnerSys believes that if the pipeline operator is going to exert that effort, he or she might as well move all the way to operations best practices. More to the point, the level of effort required to move from strict compliance to a best practice is minimal if you adopt the change early in the process.
Situational Awareness
Fundamental to our approach is the understanding of two key concepts. The first is situational awareness. Originally a battlefield science, situational awareness evolved as a human-machine interface science from World War II aviation experiences.
At the beginning of WWII, the Army Air Corps was losing more pilots in training than in combat. To determine why this was happening, the Army Air Corps formed a team of specialists, including aerospace engineers, training experts, psychologists, pilots and maintenance personnel. They discovered that the cockpit of each of the various aircraft used for training was set up differently.
Typically, pilot trainees started in one aircraft cockpit and moved to a different aircraft. As trainees transitioned between aircraft, they lost situational awareness because controls, instruments and radios would be positioned inconsistently and systems worked differently.
This study highlighted key factors, or instruments, needed to fly an airplane. These became known as the “sacred six.” Today, you can go into any aircraft cockpit anywhere in the world – any kind, size or shape – and the flight instrumentation setup is always the same.
How does situational awareness (SA) relate to pipeline operations? Over the past 15-20 years, control rooms have evolved from a collection of analog gauges, alarm light panels and switches to an array of computer consoles.
Many of these computer consoles are designed by process engineers, not operations personnel, even though operators look at a process differently than an engineer does. As a result, control systems are frequently not built for the needs of the controller.
In an old-style control room, operators would spend eight hours a day, 40 hours a week surrounded by analog gauges, alarm lighting panels, chart recorders and manual control switches.
They would come to know the operating condition of the system based solely on the position of the switches, alarm panel patterns and dial positions. Also, because they knew how all systems looked in normal operations, they could immediately tell if operations were normal or abnormal. Simply scanning the room for familiar patterns would give them SA, with no major cognitive load required.
Modern technology has given us flexibility and the ability to change, but it has cost us the benefit of our innate human ability to recognize patterns. To effectively implement API 1165, we first drive toward optimizing situational awareness for the controllers. EnerSys has adapted Nimmo’s methodology to build intelligence into the graphics and minimize visual and cognitive load. This required a radical rethink of how we’ve typically built SCADA displays. Today, structuring control room systems requires more thought, but delivers substantially greater benefit.
Permission To Operate
The second key concept in EnerSys’s approach is permission to operate. The premise, taken from Rothenberg’s work, is that you maintain permission to operate as long as situational awareness is maintained. When you lose situational awareness – meaning: “I don’t know what’s going on with the process” – then you no longer have permission to operate and need to return to a safe operating condition. This could require shutting down or changing operating parameters, depending on the process.
Typically, one might use alarming to keep abreast of operating conditions, but most alarming implemented in this context is really “alerting.” In the CRM context, the appropriate term is “safety-oriented alarm” – something telling you that the process is entering into an abnormal operating condition. If the process is approaching an abnormal operating condition, the controller may lose situational awareness. The controller needs this advance notice so he or she can take effective action and gain control of the process before situational awareness is lost.
Our recommended best practice would be to define an alarm as something requiring controller action. Further, EnerSys recommends codifying all alarms and making them available to the controller to improve his or her ability to accurately respond. Because field personnel may be needed at a site to handle any manual portions of the process, the amount of time that elapses is critical to the success of this entire process.
Frequently, a process moves from normal operations into abnormal operations before the controller receives any notification. Then, the process must exceed some limit to raise an alarm. Once the alarm is raised, the controller identifies the cause, determines the solution, takes corrective action and waits for the process to respond.
Retaining permission to operate requires implementing alarm management in such a way that alarms are clear and unambiguous. Systems must be designed so that an alarm does not prompt the controller to act too early when a problem arises, yet leaves sufficient time to diagnose the problem and determine what specific action is required, with enough of a cushion for the process to respond.
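The timing requirement above can be read as a simple time budget. The sketch below is illustrative only: it assumes a process variable that moves toward its limit at a roughly constant worst-case rate, and the function names, parameters and numbers are hypothetical rather than taken from any standard or from EnerSys’s methodology.

```python
def latest_alarm_setpoint(limit, worst_rate, diagnose_s, act_s, respond_s):
    """Highest setpoint that still leaves time to diagnose, act and
    let the process respond before `limit` is reached.

    limit       -- value that must not be exceeded (e.g. psig)
    worst_rate  -- assumed worst-case rise, in units per second
    *_s         -- time allowances in seconds
    """
    if worst_rate <= 0:
        raise ValueError("worst_rate must be positive")
    budget_s = diagnose_s + act_s + respond_s
    return limit - worst_rate * budget_s


def setpoint_is_workable(setpoint, normal_max):
    """A setpoint at or below the normal operating maximum would trip
    during normal operations, i.e. the alarm would come too early."""
    return setpoint > normal_max


# Hypothetical numbers: 1,000 psig limit, 0.5 psi/s worst-case rise,
# 120 s to diagnose, 60 s to act, 120 s for the process to respond.
setpoint = latest_alarm_setpoint(1000.0, 0.5, 120, 60, 120)
print(setpoint)                               # 850.0
print(setpoint_is_workable(setpoint, 800.0))  # True
```

If the computed setpoint falls inside the normal operating range, no setpoint can satisfy both conditions. That suggests the response times, the process design or the alarm itself need to be revisited during rationalization.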
To accomplish this, EnerSys uses Rothenberg’s methodology to create an alarm philosophy that defines alarm types and an operating approach. Once a philosophy is established, every alarm in the system is analyzed and a specific alarm rationalization is captured. The resulting alarm rationalization database is then made available to the controller as part of the console.
This database gives the controller specific information for every alarm in the system, advising him or her of the potential causes, how the alarm is confirmed and the consequences of not taking timely action. The controller also knows which actions, if any, are performed automatically, and which manual corrective actions need to be taken. When implementing this process, companies often find that the time a controller takes before acting may increase, but the total time required to reach the correction is reduced. So, you spend more time in analysis – but less time acting – and you decrease your overall number of alarms.
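One way to picture an entry in such a rationalization database is sketched below. The fields mirror the items listed above (causes, confirmation, consequences, automatic actions and manual actions). The class name, the tag `PT-101-HI` and the example content are hypothetical and are not drawn from Rothenberg’s text or from any particular SCADA product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmRationalization:
    """One rationalized alarm (structure is an assumption)."""
    tag: str
    description: str
    potential_causes: tuple
    confirmation: str
    consequence_if_ignored: str
    automatic_actions: tuple
    manual_actions: tuple


def build_index(records):
    """Index records by alarm tag for quick lookup at the console."""
    return {r.tag: r for r in records}


def guidance_text(index, tag):
    """Format what the controller would see when an alarm is raised."""
    r = index.get(tag)
    if r is None:
        return f"{tag}: no rationalization on file"
    lines = [
        f"{r.tag}: {r.description}",
        "Potential causes: " + "; ".join(r.potential_causes),
        "Confirm by: " + r.confirmation,
        "If not addressed: " + r.consequence_if_ignored,
        "Automatic actions: " + ("; ".join(r.automatic_actions) or "none"),
        "Manual actions: " + "; ".join(r.manual_actions),
    ]
    return "\n".join(lines)


# Hypothetical example entry.
index = build_index([
    AlarmRationalization(
        tag="PT-101-HI",
        description="Station discharge pressure high",
        potential_causes=("downstream valve closed", "transmitter fault"),
        confirmation="compare with downstream pressure reading",
        consequence_if_ignored="pressure may exceed the operating limit",
        automatic_actions=(),
        manual_actions=("reduce pump speed", "dispatch field technician"),
    ),
])
print(guidance_text(index, "PT-101-HI"))
```

Keeping each alarm’s rationale in a structured record, rather than in free-form notes, is what makes it practical to present the same guidance consistently at the console.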
Once you understand the process, the next requirement in complying with the rule is a focus on the environment. One aspect of the rule concerns the environment itself, ensuring that you build it in a way that will not cause eye strain and cognitive overload or contribute to fatigue and distraction. This is critical to all environments – large or small. Compliance varies depending on the kind of process you have and your particular operating philosophy. In addition, a major challenge for midstream operators is the wide geographic distribution of their control rooms.
Considering SCADA
Historically, we think of SCADA as communications and wiring bringing signals to a display. As an industry, we haven’t given much thought to the operator interface. SCADA is a complex toolset – including multiple software components and communications technologies, complex networking and field equipment. To have a firm grasp on the entire system, the pipeline operator needs a well-documented data flow and infrastructure.
Figure: Classic SCADA Dataflow
Suppose we drew a picture of most companies’ SCADA systems and asked the question, “What’s missing from a CRM standpoint?” What we’d probably find missing is a way to record shift handovers and controller logs, as well as processes for alarm management, simulation training, leak detection modeling, commissioning support and change management. In addition, some rework is usually required – both in the way data is displayed on the HMIs and the way the alarms are processed. Some of these changes may be strictly procedural, some may entail software updates – and some may require a combination of both.
Figure: SCADA Dataflow With CRM
Final Thoughts
We have one overriding message for anybody operating a pipeline control center: you will need to transform what you’re doing. Although achieving compliance may be adequate, stopping there misses a huge opportunity. Addressing this new rule offers you a prime opportunity to take a structured approach to best practices that will significantly improve your operations.
In studying other industries, Nimmo and Rothenberg have observed such additional benefits as reduced insurance costs, less machinery wear and more efficient operations. When a controller encounters alarm conditions less frequently and clears them more effectively, he or she doesn’t work the equipment as hard and has more time to focus on operational efficiency. As a result, the researchers found strong ROI in other industries that have implemented best practices.
Although the pipeline industry has not implemented best practices long enough to develop those numbers yet, we are analyzing this direction. The requirements will drive a rework of displays, alarms and procedures. Doing this incrementally will require a thoughtful, disciplined methodology.
EnerSys is actively looking for people to work with us around our methodology and we’re performing gap analyses as a starting point. These gap analyses reveal where you are now with regard to the rules, identify the appropriate approach to reach compliance with the requirements and ultimately will lead you to the implementation of best practices.
Acknowledgment
This article is based on a presentation by Russel W. Treat at the ENTELEC 2011 Conference & Expo, May 24-26, 2011, in Houston.
Author
Russel Treat, B.S. and M.B.A., is founder and president of EnerSys Corporation. He has more than 20 years of experience in system engineering, including business process analysis, market assessment, technology evaluation, development of control systems strategy tied to business objectives, custody transfer measurement and project management. He is a registered professional engineer in Texas and Minnesota and is a past president of the Gulf Coast Gas Measurement Society. He is a member of the General Committee for the American School of Gas Measurement Technology and is a director for the Energy Telecommunications and Electrical Association (ENTELEC). He participates in the Greater Houston Partnership’s CEO Roundtables. He can be reached at 281-598-7100 or rtreat@enersyscorp.com.