System Hazard Platform: Case Study NASA Field Joint Failure

NASA grew overconfident after a run of consecutive flights with no major failures leading up to Flight 25 of the Space Shuttle program, the final launch of Challenger, and failed to apply quality assurance correctly to reanalyze the possibility of failure when extreme cold weather was forecast for the launch. System Hazard Analysis, applied correctly to the failure patterns of the Solid Rocket Booster (SRB) field joints, may have prevented the launch of the tragic Flight 25, in which the vehicle was lost and all seven astronauts were killed. This paper explains the steps of System Hazard Analysis that, if followed, may have provided the data necessary for NASA to correct the field joint design before, rather than after, the loss of Challenger.


Introduction
Quality Assurance (QA) is a systematic process for inspecting and evaluating a product during development and manufacturing to ensure that it meets requirements. It is an effort by the entire production team to prevent mistakes or to discover defects before a product is released for use in the field. Every member of a production team should be trained in the importance of applying quality assurance techniques to their product. Over the last 100 years, various QA techniques have been developed and refined so that companies can deliver a final product with few or no defects. Developing QA in a company involves correctly interpreting a customer's inputs and establishing the proper standards and process control in design, development, and manufacturing, so that the resulting product satisfies the customer. Communication among all internal and external stakeholders is also essential to detecting and repairing defects in a product.
In one of the most memorable tragedies in U.S. history, quality assurance was not applied correctly by NASA management prior to the Flight 25 launch of the Space Shuttle Challenger. On the eve of the launch, the ambient temperature at the Kennedy Space Center (KSC) was forecast to be near or below freezing at the scheduled launch time. NASA and Morton Thiokol-Wasatch, the manufacturer of the Solid Rocket Booster (SRB), became concerned about the performance of the SRBs in cold weather because of problems noticed in past launches. These concerns related specifically to the field joints of the SRBs: post-flight analysis of several previous launches had shown that hot combustion gases had charred and eroded the field joint seals (Vaughan, 1997).
Field joints mate the case segments of the SRB. The SRB consisted of seven segments that were assembled into four segments by the contractor, Morton Thiokol-Wasatch. These segments were then shipped to KSC to be stacked vertically, with a field joint joining and sealing each pair of mated segments. Each cylindrical segment had two different ends, each forming half of a field joint. The bottom of a segment, called the tang, was inserted into a cavity at the top end of the adjacent segment, called the clevis. Clevis pins held the joint together: each field joint had 177 clevis pins evenly spaced around the circumference of the SRB case. Shims were used to align each mating tang and clevis. A seal was formed within the field joint by two compressed, rubberlike Viton O-rings installed in two channels cut into the clevis, where they contacted the tang. To protect the O-rings, a putty made of asbestos-filled zinc chromate was used to seal the inner wall area between the mated segments (Davis et al., 1990; Vaughan, 1997). Figure 1 shows the details of the mating parts and seals of a completed field joint used on the Challenger.

Figure 1. SRB Field Joint
The purpose of the field joint was to contain the hot gases while the SRBs burned fuel and propelled the shuttle and external tank toward space during the first two minutes of flight. The theory was that the putty, in conjunction with the redundant compressed O-rings, would seal the joints between segments and contain the pressure of the hot gases built up during launch. During testing prior to the Challenger launch, a phenomenon called joint rotation was discovered. Joint rotation was the shifting of hardware within the field joint, which could open a gap between the tang and the clevis inner wall wide enough for the O-rings to lose their compressed seal and allow hot gases to escape. As a result of this testing, the shims were made thicker to improve alignment of the tang and clevis and to resist shifting of the tang, clevis, and clevis pins. O-rings with a larger diameter were also adopted. After further testing of joint rotation and analysis of field joints from early Space Transportation System (STS) launches, the design of the field joint was agreed to be an acceptable risk (Vaughan, 1997).
As STS missions continued in the lead-up to the Challenger launch, post-flight analysis showed that hot gases had penetrated some of the SRB field joints. The primary O-ring was being eroded by high-temperature gases escaping past the putty seal. The secondary O-ring did its job by maintaining the field joint seal, but the issue raised concern among the management of NASA and Morton Thiokol-Wasatch. The approved field joint design was still not preventing joint rotation, and in extreme cases the secondary O-ring could also fail. This led to an investigation just a few days before the scheduled launch of the Challenger. With frigid temperatures expected at launch time, the attention of engineers and management turned to the relationship between temperature and O-ring elasticity. On the eve of the launch, Morton Thiokol-Wasatch engineering suggested that the cold might cause the O-rings to harden and fail to seal correctly in response to the initial buildup of pressure once the SRBs ignited at liftoff (Vaughan, 1997). At this point NASA management was under pressure to proceed with the scheduled launch, because multiple postponements had drawn negative media attention and added strain to an already busy launch schedule. NASA was able to persuade Morton Thiokol-Wasatch management to agree to the launch based on the lack of extreme failures in past launches, even though the cold temperature was an extreme case that had never been tested. "STS 51-L was launched at 11:38 A.M. EST. The ambient temperature at the launch pad was 36 °F. The mission ended 73 seconds later as a fireball erupted and the Challenger disappeared in a huge cloud of smoke" (Vaughan, 1997).

Literature Review
Improvements in quality assurance processes have helped prevent major failures like the loss of the Space Shuttle Challenger. Companies large and small have been forced to invest in practices that reduce the risk of costly failures later in the life cycle of a product. As far back as the 1920s, pioneers of the quality movement like Walter Shewhart began questioning the quality methods of companies and developing new ways to manage the quality of a product. To reduce the frequency of failures, Shewhart introduced statistical process control to distinguish between assignable-cause and chance-cause variation (Shewhart, 2011). "His mentoring of other engineers at Western Electric and his groundbreaking work with control charts arguably led a quality revolution and launched the quality profession" (Smith, 2009a). Shewhart's influence on his supervisor, George D. Edwards, would eventually lead Edwards to become the first president of the American Society for Quality Control, which combined several local societies formed during World War II into a national society to promote quality education (Smith, 2009b). Another pupil of Shewhart, W. Edwards Deming, learned and successfully applied Shewhart's theories. Deming became a master of quality processes, producing his famous 14-point philosophy and teachings that "affected a quality revolution of gargantuan significance on American manufacturers and consumers" (SkyMark, 2015). He was also a major influence on Japan's export-led economic rise (Best & Neuhauser, 2005). Dodge (1943) used Shewhart's statistical process control to create a sampling inspection plan for products manufactured in high quantities in a continuous process. Harry Romig would eventually partner with Dodge to create the Dodge-Romig sampling plan. Sampling plans played an important role in quality control systems, and "the success of the inspection function has profound effects on the viability of the production process and the profitability of the enterprise" (Jaraiedi & Segall, 1990).
Yet another quality pioneer who worked with Shewhart was Joseph Juran, who was credited with adding the human dimension to the already important statistics of quality. He identified many human relations problems linked to resistance to change, or cultural resistance (Phillips-Donaldson, 2004). His influence on the global improvement of quality has been compared to that of Deming. Visits to Japan by Deming and Juran influenced Kaoru Ishikawa to become a leader in quality improvement there. Ishikawa believed in quality through leadership, with management improving employees' work habits to match the quality expected by the external customer through education and by instilling a selfless personal commitment (Watson, 2004). Eugene Grant was an innovator in engineering economics, industrial engineering, and statistical quality control who learned from some of the original developers of improved quality. He wrote textbooks and helped teach the techniques to some of the quality experts like Deming, Shewhart, and Dodge (Smith, 2012).
Feigenbaum (1956) added a new twist to quality when he discussed the principle of a total quality view, which held that quality effectiveness starts with the design of a product and ends when the product is received by the customer. This principle is considered by some to be the initial theory of Total Quality Management. Another quality expert, Philip Crosby, focused on prevention by stressing zero defects and an attitude of doing things right the first time (Suarez, 1992). His quality improvement efforts centered on better planning and analysis of requirements early in development, in an attempt to get it right from the start.
A newer quality improvement process called Six Sigma was developed by Bill Smith and Bob Galvin in 1986. It established the idea of improving quality until defects become so few that they are statistically insignificant (O'Farrell, 2015). Keller (2011) provides a step-by-step guide to applying core quality processes through techniques like Six Sigma or Lean. Another advocate of Six Sigma is Thomas Pyzdek, who founded the Pyzdek Institute to provide Six Sigma training and consultation. Pyzdek holds that Six Sigma cannot be compared to traditional Quality Management and that it takes quality control to a new level (Noorani, 2014).
A few years after the Challenger tragedy, Daniel Goldin was hired at NASA, where he initiated a revolution to transform the space program with his "faster, better, cheaper" approach to delivering high-value programs without sacrificing safety (Thompson & Davis, 2009). Another major contributor to NASA and to quality improvement was Paul Lachance, who was responsible for ensuring food safety for astronauts during missions. In an effort to identify food safety hazards and apply critical control points to reduce threats, Lachance became the developer of the Hazard Analysis and Critical Control Point (HACCP) quality control technique (Valigra, 2012). HACCP is a form of System Hazard Analysis, which defines the hazards of systems, subsystems, and the interactions between subsystems. Applied correctly, this technique may have prevented the launch of the Challenger until all hazards had been defined and the risks from those hazards mitigated.

System Hazard Analysis Applied to the STS SRB Field Joint
System Hazard Analysis is used to define the interface effects of subsystems. In the case of the field joint, the opposite ends of two SRB segments are joined by inserting the tang of one segment into the clevis of another, using shims to center the tang in the clevis. High-temperature putty is applied to the inner portion of the mated joint to keep hot gases from contacting the O-rings. Two grooves, or channels, are cut into the clevis, where the primary and secondary O-rings are seated before the segments are mated. Once the segments are mated, clevis pins are installed through the clevis and tang to lock the segments together. A pressure port allows the seal of the two O-rings to be pressure checked after the field joint is complete.
Leading up to the launch of the Challenger, historical data had shown minor failures of the field joint. Inspections of O-rings from past launches showed various levels of erosion of the primary O-ring, but the secondary O-ring had always functioned properly and maintained the field joint seal. As Figure 2 shows, System Hazard Analysis requires several steps to be completed to define and mitigate hazards or risks efficiently. Table 2 shows historical data from 23 previous Space Shuttle launches before the loss of the vehicle on Flight 25. One data point was lost because the SRBs could not be recovered for one of the missions. If the data are split at the 70 °F mark, 6 out of 12 launches at temperatures in [53, 70] °F had O-ring failures, while only 1 out of 11 launches at temperatures in [70, 81] °F did. Although the trend is fairly obvious when failures are viewed against temperature, a binomial distribution can be used to provide further analysis of O-ring reliability.
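The temperature split above can be illustrated with a short calculation. The following Python sketch is only a minimal illustration, not the analysis performed in this study: it takes the counts quoted in the text (6 of 12 colder launches and 1 of 11 warmer launches with O-ring failures), treats each launch as an independent trial with a fixed failure probability per temperature band (an assumption), and evaluates the standard binomial probability. The function names `binomial_pmf` and `prob_at_least_one` are hypothetical helpers introduced here for illustration.

```python
from math import comb


def binomial_pmf(n: int, x: int, r: float) -> float:
    """Probability of exactly x failures in n independent launches,
    each failing with probability r (standard binomial formula)."""
    return comb(n, x) * r**x * (1 - r) ** (n - x)


def prob_at_least_one(n: int, r: float) -> float:
    """Probability of one or more failures in n independent launches."""
    return 1 - (1 - r) ** n


# Counts quoted in the text for the two temperature bands.
COLD_FAILURES, COLD_LAUNCHES = 6, 12   # launches in [53, 70] deg F
WARM_FAILURES, WARM_LAUNCHES = 1, 11   # launches in [70, 81] deg F

# Empirical per-launch failure rates (assumption: constant within a band).
r_cold = COLD_FAILURES / COLD_LAUNCHES
r_warm = WARM_FAILURES / WARM_LAUNCHES

# Chance of seeing exactly the observed number of failures in each band.
p_cold_observed = binomial_pmf(COLD_LAUNCHES, COLD_FAILURES, r_cold)
p_warm_observed = binomial_pmf(WARM_LAUNCHES, WARM_FAILURES, r_warm)

print(f"cold failure rate: {r_cold:.3f}, warm failure rate: {r_warm:.3f}")
print(f"P(at least one failure in 5 cold launches): {prob_at_least_one(5, r_cold):.3f}")
print(f"P(at least one failure in 5 warm launches): {prob_at_least_one(5, r_warm):.3f}")
```

Under these assumptions the per-launch failure rate is 0.50 in the colder band and about 0.09 in the warmer band. Neither band includes a temperature near the 36 °F recorded at launch, so any estimate for that condition would be an extrapolation beyond the data.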
For N launches, each with failure probability R, the binomial probability of exactly x O-ring failures is

$$P(x) = \binom{N}{x} R^{x} (1 - R)^{N - x} \tag{1}$$

which is evaluated for colder temperatures with N = 23, x = 6, and R = 6/12, and for warmer temperatures with N = 23, x = 6, and R = 1/11. These formulas can be used to analyze the probability of failures with respect to temperature in a given number of launches. The field joint is not considered completely failed until the putty seal and both the primary and secondary O-rings fail. The complete failure probability therefore combines the failure probabilities of all three seals, using the temperature-based probability of failure from earlier.
The tragedy of the Challenger "triggered an in-depth, detailed review of analytical, design, manufacturing, assembly, and testing methods for solid rocket motor field and factory assembly joints" (Harkins, 1999). NASA overhauled its quality assurance processes to improve hazard analysis and the process for attaining safety-of-flight approval. A new independent organization was created to oversee critical flight safety matters, working directly with teams investigating issues but retaining its separate identity as final arbiter reporting directly to the NASA Administrator (NASA, 1987). Once the organization was established, the first priority was the redesign and testing necessary to fix the SRB field joints. The redesign featured a new tang capture latch, which included a third O-ring seal and helped control movement of the mated tang and clevis. External heaters were added to warm the field joints prior to launch so that O-ring compression and response to structural loads and pressure buildup would be optimal. Insulation design improvements were also implemented to prevent seal leakage from the inner mating walls of the field joint. Figure 3 shows a diagram of the new field joint design.

Conclusion
NASA's breakdown in communication and disregard of a lingering issue led to a historic tragedy and the loss of seven lives. There was already a great deal of public interest in the Challenger launch because a school teacher had been selected to fly into space. This extra attention, combined with the busy launch schedule of that year, put additional stress on NASA management, which had also received negative media attention because of the many delays prior to the launch. On the eve of the Challenger launch, possible failures due to extreme cold conditions were brought to the attention of NASA management, but they chose to accept the risk of failure because they were under extraordinary pressure to launch (Vaughan, 1997). NASA management chose to ignore their quality standards rather than reassess the possibly elevated risk. Following a detailed process of System Hazard Analysis may have shown NASA management that the risk was too high for safety-of-flight approval. The added pressure of delaying another flight would have been preferable to the stress of a failure investigation that revealed a lack of communication among NASA officials and a failure to apply quality assurance processes to reevaluate the level of risk in a last-minute issue that affected human lives. For future work, the application of virtual reality in Hazard Analysis could help pinpoint risk elements.

Table 1. Field Joint Subsystem Hazard Analysis

Table 2. Historical Data of Field Joint Failures from Previous Challenger Launches