

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




OCR for page 15
3 NASA's Safety Process For The National Space Transportation System Program

Before entering into a discussion of the Committee's findings regarding various specific aspects of the process that NASA relies on to ensure the safety of the Space Transportation System (STS), it may be useful to provide a basic overview of the elements and purposes of that process. Readers who are already familiar with the structure and purposes of NASA's present safety process may wish to skip over this "orientation" section and begin reading at Section 4.

The measures taken to ensure safety follow basic NASA policy, issued at the Administrator level. The implementation of that policy is guided and overseen by descending levels of management throughout NASA Headquarters and the NASA field centers and their contractors involved in STS development and operation. Various organizations within NASA have different and overlapping sets of responsibilities with respect to safety of the STS. At the heart of the safety process is a set of analyses of the system configuration and function. NASA's activities in the safety area since the Challenger (51-L) disaster have centered on these analyses and on the needed engineering changes in the STS system which the analyses have helped to identify.

This section is intended to be only a factual description of NASA's safety process, with emphasis on policy and structure (as perceived by the Committee). The Committee's analysis and comments are presented beginning in Section 4.

3.1 POLICY ON SAFETY

NASA policy regarding safety is established by the Administrator through NASA Policy Directive (NPD) 1701.1, "Basic Policy on Safety." The purpose of this document is to prescribe "the basic policy for planning, developing, conducting, and evaluating agency activities to ensure the highest practicable standards of safety in all NASA programs." The essence of the policy is to:

"a. Avoid loss of life, injury of personnel, damage and property loss.
"b. Instill a safety awareness in all NASA employees and contractors.
"c. Assure that an organized and systematic approach is utilized to identify safety hazards and that safety is fully considered from conception to completion of all . . . agency activities.
"d. Review and evaluate plans, systems, and activities related to establishing and meeting safety requirements both by contractors and by NASA installations to ensure that desired objectives are effectively achieved."

The accompanying NASA handbook (NHB 1700.1 (V1)) states that " . . . the steps necessary to achieve safety of operations begin with initial planning and extend through every facet of NASA's activities. Under this concept, every manager throughout the organization is responsible for systematically identifying risks, hazards, or unsafe situations or practices, and for taking steps to assure adequate safety in the activities and products under his supervision." Out of this broad policy framework are derived the more specific safety requirements that are implemented in successively greater detail down through Headquarters, program and project organizations at the NASA centers, and contractor organizations.

3.2 MANAGEMENT STRUCTURE

3.2.1 Program Management

The development and operation of the STS is carried out through a National Space Transportation System (NSTS) Program. This Program draws on resources functionally located at three of the NASA field centers. Prior to the Challenger mission 51-L the NSTS Program was managed out of Johnson Space Center (JSC), in Houston; JSC is also responsible for the Orbiter element of the STS as well as the integration of all STS elements. Marshall Space Flight Center (MSFC), in Alabama, is responsible for the propulsion elements of the STS: the Space Shuttle Main Engine (SSME), Solid Rocket Booster (SRB), which includes the Solid Rocket Motor (SRM), and External Tank (ET). Kennedy Space Center (KSC), in Florida, is responsible for major ground support equipment (GSE), and launch and landing operations.

After mission 51-L, the NSTS Program Director was brought to NASA Headquarters (Level I) to manage the program from a location closer to top agency officials and at a level which has oversight of all three field centers. The Deputy Director (Program) of the NSTS Program remains at JSC; the recently established position of Deputy Director (Operations) is located at KSC. At each NASA center there are Project Managers responsible for the particular elements and systems. These Project Managers, in a matrix organizational arrangement, report functionally to the NSTS Program Director as well as organizationally to the center management. Reporting to the Project Managers are various Subsystem Managers who are directly responsible for the engineering effort on their subsystems. Thus, within the center organization there are engineers and other personnel supporting the NSTS Program.

Management levels within the NSTS Program are referred to as "Level I, Level II", and so on according to the hierarchy shown in Figure 3-1. Each level of management has a specific scope of responsibility, as described in the figure. Basically, Level I is Headquarters, primarily concerned with policy and broad program formulation and management; Level II is the major program management level; and Level III is the project management level. The Level I Program Director is at Headquarters, and reports to the Associate Administrator for Space Flight. Level II for development resides at JSC (viz., the Deputy Director [Program]) and at KSC for operations (the Deputy Program Director [Operations]), while Level III is dispersed across all of the participating NASA centers.

FIGURE 3-1 National Space Transportation System Program management relationships (after NASA).

[Organization chart: Director, NSTS (HQS); Deputy, Program (JSC); Deputy, Operations (KSC); JSC, MSFC, KSC, VLS; Project Manager; Contractors/Design Activities; Project Implementation.]

LEVEL I: TOP LEVEL PROGRAM REQUIREMENTS, BUDGETS AND SCHEDULES. CONTROL OF CHANGES ABOVE $1 MILLION/YEAR OR TWO MILLION TOTAL OR THOSE IMPACTING LEVEL I REQUIREMENTS OR SCHEDULES.

LEVEL II: MANAGEMENT AND INTEGRATION OF ALL ELEMENTS OF THE PROGRAM. INTEGRATED FLIGHT AND GROUND SYSTEM REQUIREMENTS, SCHEDULES AND BUDGETS; CONTROL OF PROJECT INTERFACES; CONTROL OF CHANGES EXCEEDING PROJECT BUDGETS, OR THOSE IMPACTING LEVEL II REQUIREMENTS, INTERFACES, OR SCHEDULES.

LEVEL III: PROJECT ORIENTED FLIGHT AND GROUND SYSTEM REQUIREMENTS, SCHEDULES, AND BUDGETS; CONTROL OF CHANGES WITHIN PROJECT LEVEL BUDGETS, SCHEDULES, AND SPECIFICATIONS.

LEVEL IV: DETAILED FLIGHT AND GROUND SYSTEM REQUIREMENTS WITHIN ASSIGNED PROJECT. CONTROL AND IMPLEMENTATION OF DETAILED DESIGN.
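The division of change-control authority listed in Figure 3-1 can be read as a routing rule: a proposed change is decided at the level whose limits it falls within. The Python sketch below is illustrative only. The class `ProposedChange`, its fields, and the function `controlling_level` are hypothetical names, and the Level I dollar limits are this sketch's reading of the figure text ($1 million per year or $2 million total), not values confirmed elsewhere in the chapter. Level IV, which covers detailed design within an assigned project, is not modeled.

```python
from dataclasses import dataclass

# Assumed reading of the Level I cost limits in Figure 3-1 (US dollars).
LEVEL_I_ANNUAL_LIMIT = 1_000_000
LEVEL_I_TOTAL_LIMIT = 2_000_000


@dataclass
class ProposedChange:
    """Hypothetical record describing a proposed change."""
    annual_cost: float
    total_cost: float
    exceeds_project_budget: bool = False
    impacts_level_i: bool = False   # touches Level I requirements or schedules
    impacts_level_ii: bool = False  # touches Level II requirements, interfaces, or schedules


def controlling_level(change: ProposedChange) -> str:
    """Return the management level ("I", "II", or "III") that controls the change."""
    if (
        change.impacts_level_i
        or change.annual_cost > LEVEL_I_ANNUAL_LIMIT
        or change.total_cost > LEVEL_I_TOTAL_LIMIT
    ):
        return "I"
    if change.impacts_level_ii or change.exceeds_project_budget:
        return "II"
    # Changes within project-level budgets, schedules, and specifications.
    return "III"
```

Checking the Level I conditions first reflects the hierarchy in the figure: a change that crosses a higher level's limits is taken out of the lower level's hands, whatever its other properties.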

3.2.2 Review Boards

Each of the management levels has associated with it one or more boards or panels that review and approve or disapprove the actions proposed by technical and other groups at the levels below. The most important of these boards are the two Program Requirements Control Boards (PRCBs). One PRCB is at Level II and the other at Level I, chaired respectively by the NSTS Deputy Director (Program) and the NSTS Program Director. These boards meet together to review FMEA/CILs. The main Level III boards are the Configuration Control Boards (CCBs), one for each STS element and the two launch sites (KSC and Vandenberg AFB); each of the CCBs is supported by a number of Configuration Control Panels (CCPs). (See Figure 3-2.) Each of these boards and panels has controlling authority for "dispositioning" (deciding upon or recommending) proposed changes to its documentation, hardware, and software to the extent that the change does not conflict with requirements, schedules, budgets, etc., established by a higher-level board. Level II/I PRCB approval is required for all changes to flight hardware after delivery to NASA and for all changes to flight hardware that interfaces with GSE.

There are a considerable number of other Level II and III boards that are responsible for review of specific technical and management aspects of STS design, development, and operation. All of them feed, ultimately, through the Level II/I PRCBs, which are the highest boards for configuration control. These boards and their functions (some of which are shown in Figure 3-2) will be described further in Section 3.3, and from a different standpoint in Section 5.10.~.

3.3 ORGANIZATIONAL ROLES

As was noted in Section 3.1, in theory, safety in all its forms is equally the responsibility of all NASA managers and workers, as well as those of their contractors. In practice, roles and responsibilities are necessarily defined and allocated across various functional organizations. Within the NSTS Program, these safety-related roles are shared by the engineering organizations in the project offices; the Safety, Reliability, Maintainability, and Quality Assurance (SRM&QA) organization at Headquarters and the corresponding SR&QA organizations at the centers; the NSTS Engineering Integration Office; and, to a lesser extent, the operations organizations (i.e., the Astronaut Office and Mission Operations Directorate).

3.3.1 Engineering Project Offices

The engineering organization within each element project office at the centers is responsible to a Project Manager and the Program Director for the performance and reliability of hardware/software systems they develop. Safety is thus an inherent feature of the system design, development, testing, and production processes. Since it is engineers who design the unit or system, test it, certify it for operation, and inspect it after flight, it is they who have the greatest ability to understand and anticipate the ways in which the unit or system might fail. For that reason, NASA engineers have primary responsibility for carrying out the most technical of the safety analyses described in Section 3.4 (i.e., the Failure Modes and Effects Analysis [FMEA]) and for establishing the rationale for retaining critical items identified through the FMEA. They participate secondarily in other safety analysis efforts. However, few of the engineers have any formal grounding in safety engineering techniques and methodologies.

3.3.2 Safety, Reliability, Maintainability, and Quality Assurance

Safety, Reliability, and Quality Assurance (SR&QA) Offices (the maintainability function was added at Headquarters in 1986) have long existed in one form or another within the various NASA centers as staff organizations reporting to the center director. (See Figure 3-3, for example.) The corresponding Headquarters organization has existed as a policy-setting group reporting, until 1986, to the NASA Chief Engineer.

Center SR&QA staff are detailed to programs such as the NSTS Program, where they develop functional units of staff dedicated to various aspects of Safety, Reliability, and Quality Assurance.7 Their role is to provide oversight of the engineering design and development activities, and to advise the Project Manager and the various configuration control boards on the safety and other relevant aspects of systems under review. They are also responsible

7 The center SR&QA organizations have, as of the time of writing, not adopted the "M" in their organization name. We have elected to adhere to current NASA practice to avoid confusion.

[FIGURE 3-2]

for keeping records on problems and anomalies encountered in the development and operation of the STS.

SR&QA, through its Safety Divisions, has primary responsibility for conducting hazard analyses of the STS (see Section 3.4.2 for a description). This is one of the most important safety-related analyses conducted on the STS, in many ways complementing the FMEA.

In the wake of the Challenger accident, the functions and authority of SR&QA were expanded in scope, and the Headquarters organization was restructured. A new position of Associate Administrator for SRM&QA was established, with appeal rights to the Administrator of NASA on any decision relevant to the safety of the STS and its crew. The new Associate Administrator intends to establish the SRM&QA function as an effective check and balance to the overall NASA operation, one that will provide a "second-look assessment" of the entire process from design through operations. Figure 3-4 depicts the new SRM&QA organization at Headquarters.

FIGURE 3-3 Organization of NASA Johnson Space Center (NASA). [Organization chart; boxes include: Office of the Director; NASA Office of Inspector General field office; Public Affairs Office; Legal Office; Space Station Program Office; Space Station Projects Office; Director, Mission Operations; Director, Flight Crew Operations; Director, Mission Support; Director, Engineering; Director, Space and Life Sciences; Director, Administration; Director, Center Operations; Manager, White Sands Test Facility.]

3.3.3 Engineering Integration Office

The NSTS Engineering Integration Office is located at JSC, where it handles certain special aspects of STS design and development that are crucial to the safe functioning of the overall system. These include: systems integration and interface design between the different STS elements, analyses of integrated structural loads and thermal effects, software requirements and configuration control, and ground systems and operations requirements. Shuttle avionics and ascent flight systems, two systems involving electronics and software functions which cut across various STS elements, are also among the responsibilities of this office. The organization of the office is shown in Figure 3-5. Note that the figure identifies a separate review structure for systems integration and software. The Systems Integration Review (SIR) Board is a Level II board that supports the Level II and I PRCBs in all the integration areas, including ascent and entry, flight control, and thermal design. The Shuttle Avionics Software Control Board (SASCB) is the controlling authority for avionics software. Additionally, a Mission Integration Control Board (MICB), shown in Figure 3-2, is the controlling authority for changes to delegated mission integration requirements that do not affect other Level II requirements, budgets, or schedules.

The Engineering Integration Office is also responsible for carrying out a series of Element

Interface Functional Analyses (EIFA), described in Section 3.4.3 below.

FIGURE 3-4 Organization of the new office of Safety, Reliability, Maintainability, and Quality Assurance at NASA Headquarters (NASA).

3.4 SAFETY ANALYSES

3.4.1 The Failure Modes and Effects Analysis and Critical Items List

At the heart of NASA's effort to ensure reliability of the Shuttle system is the Failure Modes and Effects Analysis. FMEAs are performed on all STS flight hardware as well as Ground Support Equipment which interfaces with flight hardware at the launch sites to identify hardware items that are critical to the performance and safety of the vehicle and the mission, and to identify items that do not meet design requirements. (NASA does not perform FMEAs on software; also excluded from the FMEA by definition are STS primary structure and, originally, pressure vessels.) This analysis, carried out by the element contractor, begins with an identification of the functional units of each system and a determination of the potential modes of failure for each unit. Each possible failure mode is then analyzed to determine the resulting performance of the system and to ascertain the worst-case effect that could result from a failure in that mode. All the identified items are then categorized according to the worst-case effect of the failure on the crew, the vehicle, and the mission.

Table 3-1 shows the FMEA/CIL criticality classifications, which are based on severity of effect. Items in the top four categories (Criticality 1, 1R, 2, and 2R) comprise a Critical Items List (CIL). Essentially, this is a listing of all hardware items and their failure modes which do not meet certain design and reliability requirements (related to safety) set for the Shuttle system by Level I management. Those requirements (specified in JSC 07700, Vol. I, Appendix A, para. 2.~) are as follows:

· "Redundancy requirements for all flight vehicle subsystems . . . [with specific exceptions] . . . shall be established on an individual basis, but shall be no less than fail-safe.
· "Redundant systems shall be designed so that their operational status can be verified during ground turnaround and to the maximum extent possible while in flight."

Therefore, in addition to single-point failures, the CIL also includes items that could fail in one mode and result in loss of the capability of redundant (backup) systems, items whose status is not readily detectable in flight, and redundant systems in which a single failure under certain conditions may result in loss of the total system capability.

Critical items with these failure modes must be subjected to design improvements or to corrective action to meet the fail-safe and redundancy requirements, before the Shuttle can fly with them present. If that is not feasible, a waiver request

[FIGURE 3-5]

must be submitted to NASA management to present the rationale for retaining an item that does not meet the requirements. Types of data included in this "retention rationale" include design, test, and inspection data, failure history, and operational experience. Figure 3-6 shows an example of a CIL document, including the retention rationale.

TABLE 3-1 FMEA/CIL Criticality Classification

Criticality
Category     Potential Effect of Failure
1            Loss of life or vehicle
1R           Redundant hardware element, failure of which could cause loss of life or vehicle
2            Loss of mission
2R           Redundant hardware element, failure of which could cause loss of mission
3            All others

For Ground Support Equipment only:
1S           Failure of a safety or hazard monitoring system to detect, combat, or operate when required and could allow loss of life or vehicle
2S           Loss of vehicle system

An approved waiver must support the decision to accept the risk represented by the critical item and ensure that maintenance, test, or inspection procedures will minimize the potential for the failure to occur. Figure 3-7 depicts the review and approval process for critical items. Note that the key approval reviews are done by the CCB and PRCB review boards described in Section 3.2.2. After the PRCB meets, a directive is issued that documents items for which waivers have been granted and lists actions assigned by the Board. Each critical item, along with its approved waiver, is maintained by the NSTS Program, and any subsequent changes affecting the CIL must be approved by the NSTS Program Director.

The FMEA/CIL was originally conceived as a design tool, used to ensure the early identification and disposal of critical failure modes, as well as to support other reviews of the STS design.
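The categories in Table 3-1 and the rule for Critical Items List membership can be written out as a small classifier. This is a sketch for illustration, not NASA's procedure: the names `Effect`, `classify`, and `on_critical_items_list` are hypothetical, the inputs are assumed to be the worst-case effect already determined by the FMEA and a flag saying whether the item is a redundant hardware element, and the Ground Support Equipment categories 1S and 2S are left out.

```python
from enum import Enum


class Effect(Enum):
    """Worst-case effect of a failure, as determined by the FMEA."""
    LOSS_OF_LIFE_OR_VEHICLE = 1
    LOSS_OF_MISSION = 2
    OTHER = 3


def classify(worst_case: Effect, redundant: bool) -> str:
    """Map a worst-case effect to a Table 3-1 criticality category.

    For a redundant hardware element, `worst_case` is the effect its
    failure could cause (the "R" rows of the table).
    """
    if worst_case is Effect.OTHER:
        return "3"
    base = "1" if worst_case is Effect.LOSS_OF_LIFE_OR_VEHICLE else "2"
    return base + "R" if redundant else base


def on_critical_items_list(category: str) -> bool:
    """Criticality 1, 1R, 2, and 2R items make up the CIL."""
    return category in {"1", "1R", "2", "2R"}
```

For instance, `classify(Effect.LOSS_OF_MISSION, redundant=True)` gives "2R", which places the item on the CIL, so it would need either a design change or an approved waiver backed by retention rationale.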
Since mission 51-L, the FMEA/CIL has also become an operational and management tool, used for problem analysis, to assess the efficacy of corrective actions, to identify maintenance checkout requirements and inspection points, and to reflect trends in failure history.

3.4.2 Hazard Analysis

Hazard analysis is another analytical tool used to identify and, if possible, resolve hazardous conditions that could develop while operating and maintaining STS hardware and software. Hazard identification is performed collectively by the NSTS engineering, safety, and operations organizations. Sources of information used to identify hazards include the FMEA/CIL, as well as various design reviews, safety analyses, crew procedures development, flight anomaly reports, and other sources. Hazard analyses thus consider not only the failures identified in the FMEA process, but also other potential threats posed by the environment, crew/machine interfaces, and mission activities. There are several different types of hazard analyses, as listed in Table 3-2. A typical Hazard (analysis) Report (HR) is shown as Figure 3-8.

Identified hazards and their causes are analyzed by Safety Division staff of the SR&QA offices at the NASA centers (and their contractors) to find ways to eliminate or control the hazard. A hazard is said to be "eliminated" when its source has been removed. A "controlled hazard" is one that has effectively been controlled by a design change, the addition of safety or warning devices, procedural changes, or operational constraints. Any hazard that cannot feasibly be eliminated or controlled by these means is termed an "accepted risk", and requires review and approval by Level III and II management boards and their chairmen. SR&QA maintains a closed-loop tracking system for hazard documentation, resolution, and approval. The basic steps in hazard processing and review are depicted in Figure 3-9 and Figure 3-10.
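The three outcomes described above (eliminated, controlled, accepted risk) amount to a simple decision rule. The following Python sketch shows one way to express it; `Hazard`, `disposition`, and `requires_board_approval` are hypothetical names, and the sketch assumes that every entry in `controls` is a control that has already been judged effective.

```python
from dataclasses import dataclass, field
from enum import Enum


class Disposition(Enum):
    ELIMINATED = "eliminated"
    CONTROLLED = "controlled hazard"
    ACCEPTED_RISK = "accepted risk"


@dataclass
class Hazard:
    """Hypothetical hazard record for illustration."""
    description: str
    source_removed: bool = False
    # Assumed to hold only effective controls: design changes, safety or
    # warning devices, procedural changes, or operational constraints.
    controls: list[str] = field(default_factory=list)


def disposition(hazard: Hazard) -> Disposition:
    """Apply the eliminated / controlled / accepted-risk rule from the text."""
    if hazard.source_removed:
        return Disposition.ELIMINATED
    if hazard.controls:
        return Disposition.CONTROLLED
    return Disposition.ACCEPTED_RISK


def requires_board_approval(hazard: Hazard) -> bool:
    """Accepted risks need review and approval by Level III and II boards."""
    return disposition(hazard) is Disposition.ACCEPTED_RISK
```

Elimination is tested first because a hazard whose source has been removed needs no further control, whatever devices or procedures were added along the way.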
Indicated in both Figure 3-9 and Figure 3-10 is a Mission Safety Assessment (MSA). This is a report, prepared by the Safety Division for each STS flight mission, which provides an integrated and comprehensive assessment of all activities and hazards associated with a mission, including turnaround activities. It also provides a way to identify and "baseline"

hazards (i.e., to establish their "normal," or accepted, state or level) for future flights.

FIGURE 3-6 An example of a Critical Items List document (NASA). [Sample sheet from the Shuttle Critical Items List for the Orbiter: main landing gear strut, vendor Menasco, criticality 1. Failure mode: structural failure. Causes: stress corrosion, piece-part structural failure, overload. Effect on crew/vehicle: possible loss of vehicle if main strut fails on landing. The sheet also gives the retention rationale under the headings design, test, inspection, and failure history.]

3.4.3 Element Interface Functional Analysis

Provision is made in NASA's risk management process for checking cross-element interface failure modes and effects by a number of means. One method used is the Element Interface Functional Analysis, prepared by the NSTS Engineering Integration Office with the support of Rockwell International. EIFAs are analyses of various functional failure modes that can occur at element-to-element interfaces as a result of a hardware failure in either element. There are three EIFAs: Orbiter/ET, Orbiter/SSME, and Orbiter/SRB-ET. (A fourth EIFA, on ground/flight systems, is now being generated.) The purpose of these analyses is to correlate element hardware failures with failure modes at the element interface to determine the effect on the mission, vehicle, or crew safety. EIFAs also look for failure propagation across interfaces. The EIFA activity helps to ensure that FMEA items are correctly classified as to their criticality.

3.4.4 Other Analyses

Providing basic input to the hazard analysis is a diverse group of safety analyses. NHB 5300.4 (1D-2) describes these analyses as follows: "Safety analyses are performed at the integrated and element (STS) levels and down to the component level to assure

[FIGURE 3-7]

TABLE 3-2 Types of Hazard Analyses Type of Analysis Program Phase Preliminary Hazard Analyses Fault Tree Analyses Sneak Analysis Operations Hazard Analysis Concept/design and development Concept/design and development/operations Design and development phase (when detailed de- sign available)/operations Software Hazard Analysis Design and development phase/operations Design and development phase/operations Mission Level Hazard Analysis Design and development phase/operations Mission Safety Assessment Design and development phase/operations (Source. NASA JSC) identification of hazardous conditions, hazard causes, hazard effects, hazard levels, corrective actions, and rationale fair hazard closure." An important subset of safety analyses are the systems safety analyses, cleaned as follows (in NHB 700. ~ (V3 ), System Safety): "Systems safety analyses are performed for the purpose of identifying hazards and establishing risk levels . . . in support of this concept the analyses perform five basic functions: ``a. Provide the foundation for the development of safety criteria anc requirements. "b. Determine both whether and how the safety criteria and requirements provided to engineering have been included in the designfs). ~'c. Determine whether the safety criteria and requirements created for that design have provided for adequate safety for the system. "d. Provide part of the means for meeting pre-established safety goals. "e. Provide a means of demonstrating that safety goals have been met." Two other important safety analyses are the Integrated Hazarc! Analysis (1HA) ant] Critical Functions Assessment (CFA). The NSTS Engineer- ing Integration Office, with the support of Rockwell International (the integration support contractor) produces an THA when a potential risk situation or unsafe condition is perceived, the resolution of which involves two or more STS elements. 
These Why Used Allows top level hazard definition by generic hazard and lends itself to expansion as the program progresses. Allows in-depth analysis of selected critical areas and relationships among events. Allows identification of latent nonfailure conditions that may allow undesired conditions or prevent desired conditions. Allows independent verification that software code imple- ments approved requirement. Allows identification of hazardous conditions during opera- tions caused by such things as out-of-sequence operation, omitted steps, and interaction of elements. Allows detailed analysis of mission events considering hard- ware, crew, ground operations, and software interactions. Allows assessment of previously conducted analyses for completeness and accuracy, provides analyses and pro- vides visibility of hazards by mission phase and event. analyses are reviewer! by the System Integration Review Board (STR), Ascribed earlier. The CFA, a one-time effort completed in 1978, examined critical functions during each mission phase and iclentified hardware anc! software changes which would improve safety. The CFA includecl certain multiple and cascading failure combina- tions; it is currently being reexamines! by Rockwell International to verify the results of the initial assessment and provide an update to the current STS configuration. 3.4.5 Overall Scope of Analyses The various analysis techniques employecl by NASA are intendecl to provide an all-encompassing approach to ensuring the design reliability and safety of the STS. Some of the techniques, princi- pally the hazard analyses and ElFA, tend to be "top-clown" approaches that examine certain cross- systems causes and effects. Others, such as FMEA/ ClI=, are narrower "bottom-up" analyses that pur- sue a specific event to its conclusion but only with respect to the piece of hardware involvecI. 
In a briefing to the Committee, Rockwell International presented its view of this interaction, summarized in Figure 3-11. The FMEA/CIL, EIFA, and other safety analyses feed into the various hazard analyses in a one-way flow culminating in the Mission Safety Assessment.
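The one-way flow described above can be pictured as a small directed graph. The sketch below is illustrative only: the node names follow the text, but the edge set and the helper name `flow_order` are assumptions made for this example and are not taken from the Rockwell briefing or from any NASA system.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each analysis maps to the set of analyses that feed into it.
# The edges are an illustrative reading of the text, not an official diagram.
FEEDS_INTO = {
    "hazard analyses": {"FMEA/CIL", "EIFA", "other safety analyses"},
    "Mission Safety Assessment": {"hazard analyses"},
}


def flow_order(graph):
    """Return one valid processing order for the analyses.

    graphlib raises CycleError if the graph contains a loop, that is,
    if the flow is not strictly one-way.
    """
    return list(TopologicalSorter(graph).static_order())


order = flow_order(FEEDS_INTO)
print(order)  # feeder analyses first, Mission Safety Assessment last
```

A topological sort is used here only because it succeeds exactly when no analysis feeds back into one of its own inputs, which is the property the phrase "one-way flow" implies.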

FIGURE 3-8 Excerpt from a sample Space Shuttle preliminary hazard analysis report (NASA).

FIGURE 3-9 Hazard processing steps (NASA JSC). Figure labels: representative hazard identification sources (design engineering studies, safety analyses, change evaluations, sneak analyses, milestone reviews, flight anomalies, individual inputs); hazard identification organizations; hazards; eliminated hazards; controlled hazards; accepted risks.

FIGURE 3-10 Hazard analysis review process (NASA JSC). Figure labels: configuration control boards; system safety subpanel; systems safety review board; program requirements control board; mission safety assessment.

TABLE 3-3 Critical Item Review Teams

Shuttle Element | Prime Contractor | Independent Review Contractor
Orbiter (JSC) | Rockwell International, STS Division | McDonnell Douglas Astronautics Co., Houston Division
External Tank (MSFC) | Martin Marietta, Michoud Aerospace Div. | Rockwell International, Space Transportation Systems Division
Solid Rocket Motor (MSFC) | Morton Thiokol, Inc., Wasatch Operations | Martin Marietta, Denver Aerospace Division
Solid Rocket Booster (MSFC) | United Technologies Corp., United Space Boosters, Inc. | Martin Marietta, Denver Aerospace Division
Space Shuttle Main Engine (MSFC) | Rockwell International, Rocketdyne Division | Martin Marietta, Denver Aerospace Division

(Source: NASA)

As a practical matter (as discussed in Sections 5.1 and 5.3) the FMEA/CIL, with its retention rationale, appears to be the dominant analysis, on which the waiver and some of the engineering change decisions are primarily based.

3.5 POST-51-L REEVALUATION/REVIEW

3.5.1 NASA Management Directives

In March 1986, soon after the Challenger accident, direction was sent out from the Associate Administrator for Space Flight and the NSTS Program Director to the NSTS Project Offices to reevaluate ("re-review") the FMEAs on all critical items on the STS. The Program Director described the purpose of the reevaluation as: ". . . to affirm the completeness and accuracy of the FMEA/CIL for the current National STS design." 8 Following reevaluation of the FMEA, each Criticality 1 and 1R item, along with any new items, or items for which the reevaluation had led to a change in classification, was to be resubmitted for review and approval of the waiver permitting the item to be flown aboard the STS. Authority for approval of these waivers resides at the Level I PRCB, with the NSTS Program Director having final sign-off authority. Those items not revalidated by the review were required to be redesigned, certified, and qualified for flight. In addition to the FMEA/CIL reevaluation, the directives stipulated that the hazard analyses and EIFAs also be reviewed.

8 Memorandum of March 13, 1986.

3.5.2 Process

FMEA/CIL. Each NSTS project and its prime contractor carried out the FMEA/CIL reevaluation, usually doing two separate reviews. In addition, independent contractors not otherwise involved in working on that element were selected to conduct parallel reviews of the FMEA/CIL for each element and to report the results of their assessments to NASA's review team. These independent reviews emphasized any analysis results that differed from those identified by NASA or the element prime contractor. The FMEA/CIL review participants are listed in Table 3-3.

The processing flow for the reevaluation initially varied somewhat from center to center, but was essentially like that shown in Figure 3-12 (from JSC). During the reevaluation, special effort has been directed to identifying design enhancements, operational and procedural checkout changes, or software additions that reduce the criticality and/or minimize the chance that the potential failure mode will occur.

The main difference between the re-review and the "normal review process" is the conduct of the independent reviews. Another significant difference is that the groundrules for determining Criticality 1 status were changed: FMEAs are now carried down to the individual component level (even where multiple identical components are involved), and pressure vessels (formerly excluded) are now included. These and other changes in procedure are specified in a new document, NSTS 22206, "Instructions for Preparation of Failure Modes and Effects Analysis and Critical Items List," which
was issued in October 1986 to standardize the process across the program.

Hazard Analysis. A similar review of all element and integrated system-level hazard analyses is being undertaken in response to the Challenger accident. As in the case of FMEA/CIL, each project office, its prime contractor, and the independent contractor are evaluating all hazard analyses and Hazard Reports to verify their completeness and accuracy. Figure 3-13 illustrates the current review process.

Each hazard analysis assessment is being conducted in accordance with the guidance provided in a new document, NSTS 22254, "Methodology for Conduct of NSTS Hazard Analyses." This document defines the policy and procedures required for preparing hazard analyses, Hazard Reports, and Mission Safety Assessments.

The current review consists of a technical safety evaluation of the source material used for all analyses, studies, and investigations conducted from the beginning of STS flight. Each subsystem assessment is expected to ensure that all hazards have been identified, that dispositions are accurate, and that identified risks are acceptable.

3.5.3 Relation to Engineering Redesign Activity

Since the mission 51-L accident, a substantial number of engineering changes have been undertaken to improve Shuttle safety prior to resumption of flight. Shortly after the Challenger accident, groups representing various organizational elements of NASA (design centers, Astronaut Office, etc.) presented the NSTS Program Director with lists of items which they considered as needing attention. All were Criticality 1 or 1R items. From these lists, a special Level II senior management PRCB known as the System Design Review Board recommended the selection of 90 items (consisting of hardware, software, and procedures) to undergo redesign, test, or analysis before the next flight of the Shuttle. Other items were categorized as near-term and "opportunity" actions. Since that time, the number of mandatory next-flight changes across the STS system has grown to 159.

The redesign activity has, for the most part, preceded the FMEA/CIL and hazard analysis reevaluations. Relatively few of the early items identified for next-flight change derived from the reevaluation activity. However, as the reevaluations proceeded they did disclose a number of items which are being worked before the next flight. FMEA/CILs and hazard analyses are being generated for all STS elements and modifications. The PRCB constitutes itself as the System Design Review Board to review all waiver recommendations on critical items.

3.5.4 Relation to Flight Readiness Process

The results of the various safety-related analyses feed into the flight review and readiness processes. By the time of the Design Certification Review (DCR), three months before launch, all FMEA/CIL waiver decisions, Hazard Reports, and the Mission Safety Assessment are available for review by the relevant readiness review boards.

FIGURE 3-13 Steps in the current hazard analysis reevaluation process (NASA). Figure labels: documents (existing element hazard analyses); review teams (NASA/prime contractor, independent contractor); NASA review and approval (element project office, program office/headquarters, senior safety review board).

3.5.5 Data Input and Output

Among the most important types of data for use in developing and updating the CIL retention rationale and conducting hazard analyses is feedback from actual use of the hardware. STS equipment tests, preflight checkout, postflight inspections, and inflight operational experience and data are all crucial sources of this type of data. NASA uses a number of special reports and reporting systems to collect and integrate such data. They include the following, whose names are self-explanatory:

· Problem Reporting and Corrective Action (PRACA) System
· Problem Reports (PRs)
· Discrepancy Reports (DRs) [for software]
· Unsatisfactory Condition Reports (UCRs)
· Failure Reports

The PRACA system is a large, distributed data base (one for each STS element and one for KSC ground support equipment) that contains all of the reports listed above, along with data on corrective actions taken. PRACA is the basis for many design changes. Problems found in a postflight assessment are logged into the PRACA system at the design center for that element, and all problems are tracked by JSC/NSTS via a flight anomaly report, or Failure Report. The Failure Report is cross-correlated with the FMEA/CIL number.

Steps are being taken to ensure that the results of safety analyses are available to NASA managers in a more thorough and timely fashion. For example, NASA is setting up a closed-loop accounting and review system, by which all Criticality 1, 1R, and 1S items are being tied to problem reports and their resolutions. This new System Integrity Assurance Program (SIAP), being developed under the NSTS Engineering Integration Office, is intended to ensure that STS flight and ground systems retain their design performance, reliability, and safety. It draws on the FMEA/CIL, hazard analyses, and other existing safety analysis systems.
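The idea of closed-loop accounting, in which every critical item is tied to its problem reports and their resolutions, can be illustrated with a small data-structure sketch. Everything here is hypothetical: the class names, field names, identifier formats, and helper functions are assumptions made for illustration and do not describe the actual PRACA or SIAP data bases.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemReport:
    report_id: str      # hypothetical identifier format
    fmea_number: str    # FMEA/CIL number the report is cross-correlated with
    resolved: bool = False


@dataclass
class CriticalItem:
    fmea_number: str
    criticality: str    # "1", "1R", or "1S"
    reports: list = field(default_factory=list)


def link_reports(items, reports):
    """Attach each report to its critical item; return reports with no matching item."""
    by_number = {item.fmea_number: item for item in items}
    orphans = []
    for report in reports:
        item = by_number.get(report.fmea_number)
        if item is None:
            orphans.append(report)
        else:
            item.reports.append(report)
    return orphans


def open_items(items):
    """Return critical items that still have at least one unresolved report."""
    return [i for i in items if any(not r.resolved for r in i.reports)]


# Made-up sample data, for illustration only.
items = [CriticalItem("FMEA-0001", "1"), CriticalItem("FMEA-0002", "1R")]
reports = [
    ProblemReport("PR-1", "FMEA-0001", resolved=True),
    ProblemReport("PR-2", "FMEA-0002"),
    ProblemReport("PR-3", "FMEA-9999"),
]
orphans = link_reports(items, reports)
print([i.fmea_number for i in open_items(items)])  # items with open problems
print([r.report_id for r in orphans])              # reports lacking a critical item
```

The loop is "closed" in the sense that both directions are checked: each critical item can be queried for unresolved problems, and each report that fails to match a critical item is surfaced instead of being silently dropped.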
A major component of the SIAP is its Program Compliance Assurance Status System (PCASS), essentially a computer-based management information system. The PCASS will serve as a central data base integrating a number of existing information systems and sources across the NSTS. For example, the PRACA will be a part of it, facilitating the reduction and presentation of data on flight anomalies. It will provide in near real-time, to users such as the participants in Flight Readiness Reviews, an integrated view of the status of problems with the STS, including trends, anomalies and deviations, and closure information. One of the major advantages of PCASS is that it will give SR&QA staff an easy route of access into the entire system of data bases dealing with the STS. Eventually, it will provide automated information on critical item status and hazard data, with a computerized FMEA planned as one of the inputs.

NASA Headquarters SRM&QA is also planning an extensive system for the documentation, reporting, review, and assessment of safety information. The NASA Safety Information System (NSIS) and the Shuttle Hazards Information Management System (SHIMS), an STS hazards data base, are two examples.

These input and output mechanisms provide the essential connectivity of the safety analyses to the continuing development, improvement, and operation of the STS within the NSTS Program.