Industrial Process Safety
Lessons from major accidents and their application
in traditional workplace safety and health
Graham D. Creedy, P. Eng, FCIC, FEIC
Formerly Senior Manager, Responsible Care®
Canadian Chemical Producers’ Association
(now Chemistry Industry Association of Canada)
System Safety Society Spring Event
May 26, 2011
Overview
• How I got into this
• The evolution of the philosophy of
industrial safety and prevention of major
accidents
• Some key insights and concepts
• How these apply to management of
workplace safety in various sectors and at
different levels of the organization
Some history
• 1984 Bhopal accident is wake-up call to
chemical industry
• Industry responsibility to understand and
control hazards and risks
• Responsible Care launched in Canada
– Principles, codes, commitment, tools, support,
progress tracking, verification
• Major Industrial Accidents Council of
Canada 1987-1999
Safety Performance by Industry Sector
Injuries & illnesses per 200,000 hours worked (2002)
Services
Finance, insurance & real estate
Wholesale & retail trade
Transportation & public utilities
Petroleum and coal products
Chemicals and allied products
Printing & publishing
Pulp & paper
Textiles & apparel
Food & food products
Transportation equipment
Electronic and electrical equipment
Industrial machinery & equipment
Primary metal industries
Construction
Mining
Agriculture, forestry & fishing
[Bar chart: horizontal axis from 0.0 to 12.0; individual bar values are not recoverable from the source]
Source: US Bureau of Labor Statistics (www.bls.gov/iif)
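The "per 200,000 hours worked" basis used in this chart can be sketched as a short calculation. This is a minimal illustration, assuming the usual convention that 200,000 hours represents 100 full-time workers at 40 hours per week for 50 weeks; the function name and the example figures are hypothetical and are not taken from the BLS data.

```python
def incidence_rate(cases: int, hours_worked: float, base_hours: float = 200_000) -> float:
    """Return the number of cases per `base_hours` hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return cases * base_hours / hours_worked


# Hypothetical site: 6 recordable injuries and illnesses over 400,000 hours worked.
rate = incidence_rate(cases=6, hours_worked=400_000)
print(f"{rate:.1f} cases per 200,000 hours")
```

Normalizing by hours worked is what allows sectors of very different sizes to be compared on one axis.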
Relative risks of fatal accidents in the workplace of selected occupations
Occupation                                                  Relative risk
Fishers (as an occupation)                                  35.1
Timber cutters (as an occupation)                           29.7
Airplane pilots (as an occupation)                          14.9
Garbage collectors                                          12.9
Roofers                                                     8.4
Taxi drivers                                                8.2
Farm occupations                                            6.5
Protective services (fire fighters, police guards, etc.)    2.7
“Average job”                                               1.0
Grocery store employees                                     0.91
Chemical and allied products                                0.81
Finance, insurance and real estate                          0.23
Sanders, R.E, J. Hazardous Materials 115 (2004) p143, citing Toscano (1997)
Chemistry Industry
Association of Canada
Member Performance
CIAC website
www.canadianchemistry.ca
Staff contact: Stephanie Butler
613-237-6215 x 245
Incident Pyramid:
1 Serious/Disabling/Fatalities
10 Medical Aid Case
30 Property Loss/1st Aid Treatment
600 Near Misses
10,000 Unsafe Behaviors/Conditions
A “proactive” approach focuses on these categories, but be careful – you may miss the really serious ones!
Terminology
• Process hazard
– A physical situation with potential to cause
harm to people, property or the environment
• Risk (acute)
– probability x consequences of an undesired
event occurring
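The risk definition above can be illustrated with a short sketch. The scenario names, probabilities and consequence values below are hypothetical and in arbitrary units; they are chosen only to show how a rare event with severe consequences can carry more risk than frequent minor ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    probability_per_year: float  # assumed annual likelihood of the undesired event
    consequence: float           # assumed loss if it occurs, in arbitrary units

    @property
    def risk(self) -> float:
        # Risk (acute) = probability x consequences
        return self.probability_per_year * self.consequence


# Hypothetical scenarios, not drawn from any real site.
scenarios = [
    Scenario("slip on walkway", 0.5, 1.0),
    Scenario("small solvent spill", 0.1, 20.0),
    Scenario("vessel rupture", 0.001, 50_000.0),
]

# Rank from highest to lowest risk.
ranked = sorted(scenarios, key=lambda s: s.risk, reverse=True)
for s in ranked:
    print(f"{s.name}: risk = {s.risk:.2f}")
```

In this made-up ranking the least frequent scenario comes first, because its consequence term dominates the product.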
They thought they were safe
• “Good” companies can be lulled into a false sense of security by their performance in personal safety and health
• They may not realise how vulnerable they are to a major accident until it happens
• Subsequent investigations typically show that there were multiple causes, and many of these were known long before the event
BP Deepwater Horizon
Why and how defences fail
• People often assume systems work as
intended, despite warning signs
• Examples of good performance are cited as
representing the whole, while poor ones are
overlooked or soon forgotten
• Analysis of failure modes and effects
should include human and organizational
aspects as well as equipment, physical and
IT systems
Process safety management
• Recognition of seriousness of
consequences and mechanisms of
causation leads to focus on the process
rather than the individual worker
• Many of the key decisions influencing
safety may be beyond the control of the
worker or even the site – they may be
made by people at another site, country
or organization
• Causes differ from those for personnel
safety
• Need to look at the whole – materials,
equipment and systems – and consider
individuals and procedures as part of the
system
• Management system approach for
control
Flixborough, Bhopal, Pasadena
Scope
(elements of process safety management)
1. Accountability
2. Process Knowledge and Documentation
3. Capital Project Review and Design Procedures
4. Process Risk Management
5. Management of Change
6. Process and Equipment Integrity
7. Human Factors
8. Training and Performance
9. Incident Investigation
10. Company Standards, Codes and Regulations
11. Audits and Corrective Actions
12. Enhancement of Process Safety Knowledge
CCPS: Guidelines for Technical Management of Chemical Process Safety
Functions of a management system
[Diagram: the management functions Planning, Organizing, Implementing and Controlling, shown with the labels Leadership, Structure, Direction, Measurement and Results]
CCPS: Guidelines for Technical Management of Chemical Process Safety
Features and characteristics of a management
system for process safety
Planning
– Explicit goals and objectives
– Well-defined scope
– Clear-cut desired outputs
– Consideration of alternative achievement mechanisms
– Well-defined inputs and resource requirements
– Identification of needed tools and training
Implementing
– Detailed work plans
– Specific milestones for accomplishments
– Initiating mechanisms
Organizing
– Strong sponsorship
– Clear lines of authority
– Explicit assignments of roles and responsibilities
– Formal procedures
– Internal coordination and communication
Controlling
– Performance standards and measurement methods
– Checks and balances
– Performance measurement and reporting
– Internal reviews
– Variance procedures
– Audit mechanisms
– Corrective action mechanisms
– Procedure renewal and reauthorization
CCPS: Guidelines for Technical Management of Chemical Process Safety
Examples of PSM management systems concerns at different organizational levels
[Diagram: Planning, Organizing, Implementing and Controlling shown across three organizational levels: Strategic, Managerial and Task]
CCPS: Guidelines for Technical Management of Chemical Process Safety
Self-assessment of Current Status
Process Safety Management
Requirements to Achieve the ESSENTIAL Level
(A page from the Site Self-Assessment Tool)
For each survey question, indicate the level of awareness and use at the site by marking the appropriate box, based on the following:
A  Widespread and comprehensive use wherever significant hazard potential exists.
B  Moderate use, but coverage is uneven from unit to unit or not comprehensive in view of potential hazards.
C  Appropriate personnel are aware of this item and its application, but little or no actual use.
D  Little awareness or use of this item.
Mark the box labeled "Help" if this is an item where you are in urgent need of guidance. We’ll have a team member contact you with advice on how and where to get the information or help.
[Form columns for each question: Current Status (A, B, C, D) and Want Help]
1. Accountability: Objectives and Goals
(a) Are responsibilities clearly defined and communicated, with those responsible held accountable?
(b) Is there a system for control of contractor operations?
2. Process Knowledge and Documentation
(a) Are the safety, health and environmental hazards of materials on site clearly defined?
(b) Is there current comprehensive documentation covering the process operating basis, including both normal and abnormal conditions?
3. Process Safety Review Procedures for Capital Projects
(a) Are all project proposals for new or modified facilities subjected to documented hazard reviews before approval to proceed?
(b) Are systems established to ensure that the facility is built as designed?
(c) Is there an effective link between design modifications and operating procedures?
4. Process Risk Management
(a) Is there a system, conducted by competent personnel, to identify and assess the process hazards from materials present at this site?
(b) Are corrective actions defined and implementation followed up?
(c) Are the above items formally documented?
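A site's answers to a form like this could be tallied along the following lines. This is only a sketch: the question identifiers, the sample answers and the `worst_acceptable` cut-off are hypothetical, and the actual criteria for reaching the Essential level are not stated here.

```python
from collections import Counter

RATINGS = ("A", "B", "C", "D")  # ordered from widespread use (A) to little awareness (D)


def summarize(responses: dict[str, str]) -> Counter:
    """Count how many questions received each rating."""
    for question, rating in responses.items():
        if rating not in RATINGS:
            raise ValueError(f"unknown rating {rating!r} for question {question}")
    return Counter(responses.values())


def needs_attention(responses: dict[str, str], worst_acceptable: str = "B") -> list[str]:
    """List questions rated worse than `worst_acceptable` (an assumed cut-off)."""
    limit = RATINGS.index(worst_acceptable)
    return sorted(q for q, r in responses.items() if RATINGS.index(r) > limit)


# Hypothetical answers keyed by element number and question letter.
responses = {"1a": "A", "1b": "C", "2a": "B", "2b": "D", "3a": "A"}
counts = summarize(responses)
followup = needs_attention(responses)
print(dict(counts), followup)
```

Keeping the per-question answers, not only the totals, makes it possible to see which elements are holding a site back.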
Use of self-assessment tool for collective progress reporting and action
As of August 29, 2008 compared with past five years (some site changes)
Target for meeting Essential level: June 30, 2003
[Stacked bar chart: number of sites (vertical axis, 0 to 160) at each level: Excellent, Enhanced, Essential, Almost at Essential, "In Progress". Bar values are not recoverable from the source.]
Sites reporting by year: 2002 (137 sites), 2003 (141 sites), 2004 (134 sites), 2005 (143 sites), 2006 (139 sites), 2007 (145 sites), 2008 (129 sites)
Process-Related Incident Measure (PRIM) 2007
Findings: All Elements
PRIM INCIDENT CAUSE ANALYSIS 1998/1999 TO 2007
[Bar chart: Incidents Analyzed (vertical axis, 0 to 40) against PSM Element Possibly Involved (horizontal axis); legend years 98/99, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007. Element labels and bar values are not recoverable from the source.]
Assessing an organization’s safety effectiveness
• What is the safety policy and culture (written,
unwritten)?
• How are the following handled?
– Establishing what has to be done
• Benchmarking
• Communicating
• Assigning accountabilities
– Ensuring that it gets done
• Monitoring and corrective action
• Evidence (documentation) and audit process
• Resourcing – not only for ideal but for anticipated
conditions
• Balancing with other priorities
• How are exceptions handled?
Consider targets in groups
• Those who:
– Don’t care
– Don’t know (and perhaps don’t know that they
don’t know)
– Did know, but may have forgotten or could
have gaps in application (and perhaps don’t
realize it)
Excellent guidance
exists – but how is it
being used?
The New Product Introduction Curve
[Chart; vertical axis: percent adoption]
• Can be applied to adoption of new ideas
• Categories differ by ability and, more importantly, motivation
Accountability
• Management commitment at all levels
• Status of process safety compared to other
organizational objectives such as output, quality and
cost
• Objectives must be supported by appropriate resources
• Be accessible for guidance, communicate and lead
Management of Change
• Change of process technology
• Change of facility
• Organizational changes
• Variance procedures
• Permanent changes
• Temporary changes
Process and Equipment Integrity
• Design to handle all anticipated conditions, not just ideal
or typical ones
• Make sure what you get is what you designed
(construction, installation)
• Test to make sure the design is indeed valid
• Make sure it stays that way
– Preventative maintenance
– Ongoing maintenance
– Review
• Be especially careful of automatic safeguards
• Consider operator as fallible human performing tasks in background
• Design for error tolerance, not just prevention
– detection
– correction
Buncefield, UK
Realization of significance of sociocultural factors in
human thought processes and hence in behaviours
Human behaviour aspects
• People, and most organizations, don’t
intend to get hurt (have accidents)
• To understand why they do leads us
eventually into understanding human
behaviour, both at the individual and
organizational level, and involves:
[Side scale: familiarity to engineers, from More (top of list) to Less (bottom of list)]
– Physical interface
• Ergonomics
– Psychological interface
• Perception, decision-making, control actions
– Human thought processes
• Basis for reaching decisions
• Ideal versus actual behaviour
– Social psychology
• Relationships with others
• Organizational behaviour
Human behaviour modes
• Instead of looking at the ways in which people can fail, look at how they
function normally:
• Skill-based
– Rapid responses to internal states with only occasional attention to
external info to check that events are going according to plan
– Often starts out as rule-based
• Rule-based
– IF…, THEN…
– Rules need not make sense – they only need to work, and one has
to know the conditions under which a particular rule applies
• Knowledge-based
– Used when no rules apply but some appropriate action must be
found
– Slowest, but most flexible
The ‘Swiss cheese’ model of organisational accidents
[Diagram: hazards pass through successive layers of defences to cause losses; some holes are due to active failures, other holes are due to latent conditions]
Reason’s “Cheese Model”
James Reason - The Management of Safety, SSAP Launch Event 17/02/2004
James Reason, presentation to Eurocontrol 2004
Active and latent failures
• Active
– Immediately adverse effect
– Similar to “unsafe act”
• Latent
– Effect may not be noticeable for some time, if at all
– Similar to “resident pathogen”. Unforeseen trigger conditions
could activate the pathogens and defences could be undermined
or unexpectedly outflanked
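The effect of a latent condition on layered defences can be sketched numerically. This is a simplified illustration, assuming each layer fails independently with a given "hole" probability; the probabilities and the factor of five are hypothetical. In practice a latent condition may weaken several layers at once, which is exactly the case the second calculation imitates.

```python
import math


def breach_probability(hole_probs: list[float]) -> float:
    """Probability that a hazard passes every layer, assuming independent layers."""
    for p in hole_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie between 0 and 1")
    return math.prod(hole_probs)


# Hypothetical baseline: three defences, each with a 10% chance of failing on demand.
baseline = [0.1, 0.1, 0.1]

# Hypothetical latent condition (for example a flawed procedure) that makes
# every layer five times more likely to fail.
degraded = [min(1.0, p * 5) for p in baseline]

p_baseline = breach_probability(baseline)
p_degraded = breach_probability(degraded)
print(f"baseline: {p_baseline:.4f}, with latent condition: {p_degraded:.4f}")
```

Because the layer probabilities multiply, a weakness shared by all layers raises the overall breach probability far more than the same weakness confined to one layer.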
A Classic Example of a Latent Failure
• Hazard of material known, but lack of awareness of potential system failure mode leads to defective procedure design through management decision
Epichlorohydrin fire, Avonmouth, UK
And another
Danvers, MA, Nov 2006
Solvent explosion at printing ink factory
• Hazards known, but defences
compromised by apparently benign
change
• Latent error in procedure design
creates vulnerability to likely
execution error
US Chemical Safety Board
And another
• Hazard of material not obvious (despite history)
• Latent error allowed dust to accumulate, creating conditions for subsequent events
Scottsbluff, NE 1996
Port Wentworth, GA 2007
Lessons from other fields
• Aerospace and nuclear show how significant
human and organizational aspects can be even
where the obvious signs of failure are technical
in nature
• Finance shows:
– Relevance of such factors without technical
distractions
– How fast a system can deteriorate once controls are
relaxed
– How wrong risk assessments can lead to bad policy
decisions
Relevance of organizational factors
“The relevance of organizational factors has also been
graphically and tragically revealed in the inquiry reports of recent
UK transportation and offshore oil disasters.
Prior to ..., senior managers in all the organizations
propounded the pre-eminence of safety. They believed in the
efficacy of the regulatory system, in the adequacy of their existing
programs, and in their confidence of the skills and motivation of
their staff.
The inquiry reports reveal that their belief in safety was a
mirage, their systems inadequate, and operator errors and
violations commonplace.
The inquiry reports stated that ultimate responsibility lay
with complacent directors and managers who had failed to ensure
that their good intentions were translated into a practical and
monitored reality. Moreover, the weaknesses so starkly revealed
were not matters of substantial concern to the regulatory
authorities before the accidents.”
HSC, 1993
Factors that can influence
likelihood of failure
• Organizational culture
– “the way we do things around here – when no-one is
looking”
– increasingly being recognized as one of the most
important factors in major accidents
– perceived balance between output, cost and safety is
heavily dependent on this culture, and influences
whether personnel work in a certain way because
they believe the company and their co-workers feel it
is the right way to do things, or whether they are
simply “going through the motions.”
In general, safety gets better as society learns more
[Chart: standard of safety (vertical axis) versus time]
But the rate of improvement is not steady
[Chart: standard of safety (vertical axis) versus time, labelled "x 10"]
In fact, the curve can be one of periodic rapid gains
followed by gradual but accelerating declines
[Chart: standard of safety (vertical axis) versus time, labelled "x 100"]
Note how the rate of decay can be expected to increase due to normalization of deviance
Organizational Culture Model
James W. Bayer, Senior VP Mfg, Lyondell Chemical Company
                 Weak systems    Strong systems
Strong people    Tribal          Operational Excellence
Weak people      Chaotic         Bureaucratic
Preservation – or loss – of corporate memory
• Demographic effects
– Fewer staff
– Experienced cohort leaving or left
– Skills transfer senior > (middle) > junior
– Replacements understand the way something is
done, but not why it is done that way, the potential
consequences of doing it differently and how to detect
and recover from undesired actions
• “We are starting to see lowered standards of
design and supervision that fifteen years ago
would have been unthinkable in the chemical
industry” (Challenger, 2004)
• What does an organization’s investigation
of its failures reveal about its:
– Culture
– Management system?
• Knowledge
– Never realized problem could occur (benchmarking error)
• Was it treated as a unique deficiency?
• Was there a broader review of the benchmarking process to find whether there are
other areas where knowledge could be deficient?
• Policy
– Thought situation would be acceptable but didn’t realize full implications
until it happened
• Does it appear to be acceptable now?
• Was review of policy and accountability limited or broad in scope?
• System design
– Even if everything had been done as intended, problem would still have
occurred
• How comprehensive was analysis of system deficiencies and practicality of
solutions?
• How effective is action plan and follow through?
• Was review of system design limited or broad in scope?
• System execution (management system error)
– Problem occurred because someone or something did not perform as
intended
• Did analysis consider why execution was not as intended?
• Was corrective action appropriate and balanced?
• Was review of system execution limited or broad in scope?
Dealing with a Safety (or Engineering) Problem
• Finding out who you’re dealing with
– Where is the organization on the curve? (generally, and regarding the specific issue or problem)
– Where are the people you’re dealing with on the curve? (generally, and regarding the issue or problem)
• Finding out what to do
– “Benchmark” – don’t try to reinvent the wheel unless you’re sure there isn’t one already (or you’ve time and it’s fun to do so)
– Find out what others are doing about it
– Read the instructions
– Identify/define the issue
– If it’s likely to be regulated, check with government agencies, trade associations and the web
– If not regulated but likely good industry practice, check suppliers, other users of the same material or item, other users of similar items, other industry contacts – but test the info! (cross-check, ask if it makes sense)
– Check standard reference works (Lees, CCPS, etc.)
• Doing it
– Try to think of all situations that are likely to occur (process, equipment, people)
– “KISS”: keep it user-friendly, show basis for decisions if practical to do so
– Follow up afterwards to see how it’s working
Questions?