Functional Safety Standards
Functional safety standards provide systematic frameworks for developing electronic systems where failure could result in harm to people, property, or the environment. These standards define processes, methods, and requirements that help engineers design, implement, and verify systems capable of achieving and maintaining safe states even when components fail or software malfunctions.
The proliferation of electronic control systems in vehicles, aircraft, medical devices, and industrial machinery has made functional safety a critical discipline within embedded systems engineering. Understanding these standards is essential for engineers working on safety-critical applications, as compliance is often legally mandated and always ethically imperative.
Foundations of Functional Safety
Functional safety focuses on the correct functioning of safety-related systems in response to their inputs, including how they behave when faults occur. Unlike general reliability engineering, which aims to minimize failures, functional safety specifically addresses how systems detect, manage, and respond to failures to prevent hazardous outcomes.
Key Concepts
Safety function: A function implemented by a safety-related system that is intended to achieve or maintain a safe state of the equipment under control. Safety functions might include emergency shutdown, overspeed protection, or collision avoidance.
Safety integrity: The probability that a safety-related system will satisfactorily perform the required safety functions under all stated conditions within a specified period. Higher safety integrity means lower probability of dangerous failure.
Safety Integrity Level (SIL): A discrete level representing a range of safety integrity values. Most standards define four levels, with SIL 4 or equivalent representing the highest integrity and most stringent requirements.
Systematic failures: Failures related to design errors, specification mistakes, or process deficiencies that occur deterministically under specific conditions. These cannot be predicted statistically and must be prevented through rigorous processes.
Random hardware failures: Failures that occur unpredictably due to physical degradation, wear, or environmental stress. These can be characterized probabilistically and managed through redundancy, diagnostics, and component selection.
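To make these concepts concrete, the sketch below shows a minimal, hypothetical overspeed-protection safety function: it monitors a speed input, treats an implausible or invalid sensor reading as a detected fault, and latches the equipment into a safe state. The sensor interface, limits, and shutdown action are illustrative assumptions rather than requirements taken from any standard.

```c
/* Minimal, hypothetical overspeed-protection safety function.
 * Sensor interface, limits, and the shutdown action are illustrative
 * assumptions; a real design would be derived from hazard analysis. */
#include <stdbool.h>
#include <stdint.h>

#define SPEED_LIMIT_RPM      3000u   /* trip point from hazard analysis (assumed) */
#define SPEED_PLAUSIBLE_MAX  10000u  /* anything above is treated as a sensor fault */

typedef enum { STATE_RUN, STATE_SAFE } system_state_t;

static system_state_t state = STATE_RUN;

/* Commanded by the safety function to achieve the safe state. */
static void enter_safe_state(void)
{
    state = STATE_SAFE;
    /* e.g. de-energize the motor contactor (assumed actuator) */
}

/* Called cyclically; returns the current state for monitoring. */
system_state_t overspeed_protection(uint32_t speed_rpm, bool sensor_valid)
{
    if (state == STATE_SAFE) {
        return state;                      /* latch the safe state */
    }
    if (!sensor_valid || speed_rpm > SPEED_PLAUSIBLE_MAX) {
        enter_safe_state();                /* fail safe on a detected fault */
    } else if (speed_rpm > SPEED_LIMIT_RPM) {
        enter_safe_state();                /* the safety function proper */
    }
    return state;
}
```

Note that a detected sensor fault is handled the same way as an actual overspeed: when the safety function cannot trust its inputs, it moves to the safe state rather than continuing to operate.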
The V-Model Development Lifecycle
Functional safety standards universally employ variations of the V-model development lifecycle. This approach pairs each development phase with a corresponding verification or validation phase:
The left side of the V progresses from high-level requirements through system architecture, detailed design, and implementation. The right side mirrors this with unit testing, integration testing, system testing, and validation against the original requirements.
Each phase produces documented work products that undergo review and approval. Traceability links requirements through design to implementation and back through testing, ensuring complete coverage and enabling impact analysis when changes occur.
Risk Assessment and Hazard Analysis
All functional safety standards require systematic identification and analysis of hazards. Common techniques include:
Hazard and Operability Study (HAZOP): A structured examination using guide words to identify deviations from design intent and their consequences.
Failure Mode and Effects Analysis (FMEA): A bottom-up approach that examines how component failures propagate through the system to cause hazardous events.
Fault Tree Analysis (FTA): A top-down approach that starts with a hazardous event and traces backward to identify combinations of failures that could cause it.
The results of hazard analysis determine required safety functions and their integrity levels, forming the foundation for all subsequent development activities.
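As a small worked example of the quantitative side of FTA, the sketch below combines assumed basic-event probabilities through OR and AND gates to estimate a top-event probability. The event names and numbers are invented for illustration, and the gate formulas assume independent events.

```c
/* Worked FTA example with invented, illustrative probabilities. */
#include <stdio.h>

/* AND gate: all independent inputs must fail. */
static double and_gate(double a, double b) { return a * b; }

/* OR gate for independent events: 1 - product of (1 - p). */
static double or_gate(double a, double b) { return 1.0 - (1.0 - a) * (1.0 - b); }

int main(void)
{
    double sensor_fails = 1e-4;   /* per demand, assumed */
    double backup_fails = 1e-3;   /* per demand, assumed */
    double logic_fails  = 1e-5;   /* per demand, assumed */

    /* Top event: protection fails = logic fails OR (both sensors fail). */
    double sensors_both = and_gate(sensor_fails, backup_fails);
    double top_event    = or_gate(logic_fails, sensors_both);

    printf("P(top event) ~= %.2e per demand\n", top_event);
    return 0;
}
```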
IEC 61508: The Foundation Standard
IEC 61508 "Functional Safety of Electrical/Electronic/Programmable Electronic Safety-related Systems" serves as the parent standard from which most industry-specific standards derive. Published by the International Electrotechnical Commission, it provides a generic framework applicable across industries and serves as the reference when no sector-specific standard exists.
Structure and Scope
IEC 61508 consists of seven parts covering the complete safety lifecycle:
Part 1: General requirements establishing the overall framework, including safety lifecycle phases and documentation requirements.
Part 2: Requirements for electrical, electronic, and programmable electronic systems, addressing hardware development and random failure quantification.
Part 3: Software requirements specifying development processes, methods, and verification activities for safety-related software.
Part 4: Definitions and abbreviations providing a common vocabulary for functional safety.
Part 5: Examples of methods for determining safety integrity levels, including risk graphs and safety layer matrices.
Part 6: Guidelines on the application of Parts 2 and 3, providing practical implementation advice.
Part 7: Overview of techniques and measures, presenting a catalogue of methods with recommendations for each SIL.
Safety Integrity Levels
IEC 61508 defines four Safety Integrity Levels based on the target probability of dangerous failure:
SIL 1: Probability of dangerous failure per hour between 10^-6 and 10^-5 for continuous operation, or between 10^-2 and 10^-1 per demand for low-demand systems.
SIL 2: Probability of dangerous failure per hour between 10^-7 and 10^-6 for continuous operation, or between 10^-3 and 10^-2 per demand.
SIL 3: Probability of dangerous failure per hour between 10^-8 and 10^-7 for continuous operation, or between 10^-4 and 10^-3 per demand.
SIL 4: Probability of dangerous failure per hour between 10^-9 and 10^-8 for continuous operation, or between 10^-5 and 10^-4 per demand.
Achieving higher SILs requires increasingly rigorous development processes, more extensive verification, greater diagnostic coverage, and often hardware redundancy.
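Because these targets form contiguous bands, classifying a computed failure measure is a simple lookup. The sketch below maps an average frequency of dangerous failure per hour (PFH) to a SIL band for continuous and high-demand operation using the ranges listed above; it illustrates the table only and is no substitute for the standard's full hardware-integrity requirements.

```c
/* Map a PFH value to a SIL band for continuous / high-demand mode,
 * following the ranges quoted above. Illustrative only. */
#include <stdio.h>

/* Returns 1..4 for SIL 1..4, or 0 if the value does not meet SIL 1. */
static int sil_from_pfh(double pfh)
{
    if (pfh < 1e-8) return 4;   /* also covers values below the SIL 4 band */
    if (pfh < 1e-7) return 3;
    if (pfh < 1e-6) return 2;
    if (pfh < 1e-5) return 1;
    return 0;
}

int main(void)
{
    double pfh = 3.2e-8;        /* example value, assumed */
    printf("PFH %.1e -> SIL %d\n", pfh, sil_from_pfh(pfh));
    return 0;
}
```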
Hardware Safety Integrity
IEC 61508 Part 2 specifies quantitative requirements for hardware safety integrity using several metrics:
Safe Failure Fraction (SFF): The proportion of failures that are either safe or detected. Higher SFF indicates better fault detection capability.
Hardware Fault Tolerance (HFT): The number of faults a system can tolerate before a dangerous failure occurs. A system with HFT of 1 can survive a single fault.
Diagnostic Coverage (DC): The effectiveness of diagnostics in detecting dangerous failures. Higher DC reduces the probability of undetected dangerous failures.
Tables in the standard specify minimum SFF requirements based on target SIL and hardware fault tolerance, guiding architectural decisions for safety-related hardware.
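SFF and DC follow directly from splitting a component's failure rate into safe, dangerous-detected, and dangerous-undetected portions: SFF is the safe plus detected share of all failures, while DC is the detected share of the dangerous failures. The sketch below computes both from an assumed breakdown; the failure rates are invented for illustration.

```c
/* Compute Safe Failure Fraction and Diagnostic Coverage from an assumed
 * breakdown of failure rates (failures per hour, illustrative values). */
#include <stdio.h>

int main(void)
{
    double lambda_s  = 4.0e-7;  /* safe failures */
    double lambda_dd = 5.0e-7;  /* dangerous, detected by diagnostics */
    double lambda_du = 1.0e-7;  /* dangerous, undetected */

    double sff = (lambda_s + lambda_dd) / (lambda_s + lambda_dd + lambda_du);
    double dc  = lambda_dd / (lambda_dd + lambda_du);

    printf("SFF = %.1f%%  DC = %.1f%%\n", sff * 100.0, dc * 100.0);
    return 0;
}
```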
Software Safety Requirements
IEC 61508 Part 3 recognizes that software failures are inherently systematic since software does not degrade or wear out. Consequently, the standard focuses on process rigor and verification thoroughness rather than probabilistic metrics.
Key software requirements include:
Software safety lifecycle: A defined sequence of activities from specification through decommissioning, with specific deliverables and reviews at each phase.
Coding standards: Requirements for programming language subsets, coding guidelines, and restrictions on potentially dangerous constructs.
Verification activities: Requirements for code review, static analysis, and testing that increase in rigor with higher SILs.
Tool qualification: Requirements to ensure development tools do not introduce errors or fail to detect them when they should.
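As an illustration of the kinds of constructs such coding standards target, the fragment below contrasts an unchecked conversion with a defensive version that validates its range and reports failure explicitly. The specific rules shown (no silent narrowing, errors returned to the caller) are typical of safety coding guidelines in general rather than quoted from IEC 61508.

```c
/* Illustrative defensive-coding pattern of the kind safety coding
 * guidelines typically require: validate ranges, make failure explicit. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unchecked version: silently truncates out-of-range values. */
uint8_t to_duty_cycle_unchecked(uint32_t percent)
{
    return (uint8_t)percent;              /* potential silent data loss */
}

/* Defensive version: range-checked, failure reported to the caller. */
bool to_duty_cycle(uint32_t percent, uint8_t *out)
{
    if (out == NULL || percent > 100u) {
        return false;                     /* caller must handle the error */
    }
    *out = (uint8_t)percent;
    return true;
}
```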
ISO 26262: Automotive Functional Safety
ISO 26262 "Road vehicles - Functional safety" adapts IEC 61508 specifically for passenger vehicles up to 3,500 kg. First published in 2011 and updated in 2018, it addresses the unique challenges of automotive development including high-volume production, extended service life, and maintenance by non-experts.
Automotive Safety Integrity Levels
ISO 26262 introduces Automotive Safety Integrity Levels (ASIL) ranging from A (lowest) to D (highest), plus QM (Quality Management) for non-safety-related functions. ASIL determination considers:
Severity (S): The potential harm from a hazardous event, from no injuries (S0) through life-threatening injuries with survival probable (S2) to life-threatening injuries with survival uncertain (S3).
Exposure (E): The probability of being in an operational situation where the hazard could occur, from incredible (E0) to high probability (E4).
Controllability (C): The ability of drivers or other traffic participants to avoid harm, from generally controllable (C0) to difficult to control or uncontrollable (C3).
A risk matrix combining these factors determines the ASIL, with ASIL D corresponding roughly to SIL 3 in IEC 61508 terms.
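The S/E/C combinations in the standard's determination table follow a regular pattern that can be expressed compactly: taking S, E, and C as their numeric indices, the sum fixes the ASIL, and any parameter at its lowest class (S0, E0, or C0) yields QM. The sketch below reproduces that pattern for illustration; the table in ISO 26262-3 remains the authoritative source.

```c
/* Illustrative ASIL determination from S, E, C indices.
 * Reproduces the regular pattern of the ISO 26262-3 table; the standard's
 * own table remains the authoritative source. */
#include <stdio.h>

typedef enum { QM, ASIL_A, ASIL_B, ASIL_C, ASIL_D } asil_t;

static asil_t determine_asil(int s, int e, int c)
{
    if (s == 0 || e == 0 || c == 0) {
        return QM;                       /* S0, E0 or C0: no ASIL assigned */
    }
    int score = s + e + c - 6;           /* S1..S3, E1..E4, C1..C3 */
    if (score <= 0) return QM;
    if (score >= 4) return ASIL_D;
    return (asil_t)score;                /* 1 -> A, 2 -> B, 3 -> C */
}

int main(void)
{
    static const char *names[] = { "QM", "ASIL A", "ASIL B", "ASIL C", "ASIL D" };
    printf("S3/E4/C3 -> %s\n", names[determine_asil(3, 4, 3)]);  /* ASIL D */
    printf("S2/E3/C2 -> %s\n", names[determine_asil(2, 3, 2)]);  /* ASIL A */
    return 0;
}
```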
Structure of ISO 26262
The standard comprises twelve parts addressing the complete automotive safety lifecycle:
Parts 1-2: Vocabulary and management of functional safety, establishing terminology and organizational requirements.
Part 3: Concept phase covering hazard analysis, risk assessment, and safety goal definition.
Part 4: Product development at the system level, including technical safety concept and system design.
Part 5: Product development at the hardware level, covering hardware design, FMEA, and random failure quantification.
Part 6: Product development at the software level, specifying software development processes and methods.
Part 7: Production, operation, service, and decommissioning, covering manufacturing and the in-service phases.
Part 8: Supporting processes, including interfaces within distributed developments, configuration and change management, and qualification of software tools and components.
Part 9: ASIL-oriented and safety-oriented analyses including dependent failure analysis.
Part 10: Guidelines on ISO 26262, providing interpretation and application guidance.
Part 11: Guidelines for semiconductor development, added in the 2018 edition.
Part 12: Adaptation for motorcycles.
Hardware Metrics
ISO 26262 introduces automotive-specific hardware metrics:
Single-Point Fault Metric (SPFM): The proportion of the safety-related hardware failure rate that is not made up of single-point or residual faults, i.e., the share of failures that are either covered by a safety mechanism or inherently safe. Higher SPFM indicates better protection against single faults causing hazardous events.
Latent Fault Metric (LFM): The proportion of the remaining failure rate that is not made up of latent multiple-point faults, i.e., the share of faults that are detected or otherwise revealed before a further fault can combine with them. This addresses faults that could combine with subsequent faults to cause dangerous failures.
Probabilistic Metric for Hardware Failures (PMHF): The average probability of a safety goal violation due to random hardware failures per hour of operation.
Minimum values for these metrics are specified based on ASIL level, with ASIL D requiring the most stringent targets.
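Both SPFM and LFM are ratios over the element's safety-related failure rate: SPFM penalizes single-point and residual faults, and LFM penalizes latent multiple-point faults among the failures that remain. The sketch below computes both from an assumed failure-rate breakdown; the numbers are invented, and the ASIL targets noted in the comment are the commonly cited ones.

```c
/* Illustrative SPFM / LFM calculation from an assumed failure-rate
 * breakdown of a safety-related hardware element (FIT = failures per 1e9 h). */
#include <stdio.h>

int main(void)
{
    double lambda_total   = 100.0;  /* total safety-related failure rate, FIT */
    double lambda_spf_rf  =   2.0;  /* single-point + residual faults, FIT    */
    double lambda_mpf_lat =   5.0;  /* latent multiple-point faults, FIT      */

    double spfm = 1.0 - lambda_spf_rf / lambda_total;
    double lfm  = 1.0 - lambda_mpf_lat / (lambda_total - lambda_spf_rf);

    /* ASIL D targets are commonly cited as SPFM >= 99% and LFM >= 90%. */
    printf("SPFM = %.1f%%  LFM = %.1f%%\n", spfm * 100.0, lfm * 100.0);
    return 0;
}
```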
Software Development Requirements
ISO 26262 Part 6 specifies comprehensive software requirements including:
Software architectural design: Requirements for modularity, hierarchical structure, and restricted interfaces to contain fault propagation.
Software unit design and implementation: Coding guidelines, language subset restrictions, and design principles appropriate to each ASIL.
Software unit verification: Requirements for code review, static analysis, and structural coverage testing with increasing rigor for higher ASILs.
Software integration and verification: Requirements for integration testing with hardware and other software components.
Tables throughout Part 6 specify methods and their recommendation level (highly recommended, recommended, or no recommendation) for each ASIL, allowing engineering judgment in method selection.
ASIL Decomposition
ISO 26262 permits ASIL decomposition, where a safety requirement at a higher ASIL is allocated to redundant elements, each developed to a lower ASIL. For example, an ASIL D requirement might be decomposed into two independent ASIL B(D) requirements.
Decomposition requires demonstrating sufficient independence between the redundant elements to prevent common-cause failures. When properly applied, decomposition can reduce development costs while maintaining overall safety integrity.
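A common architectural expression of decomposition is a pair of independently developed channels whose outputs are cross-checked, with any disagreement forcing the safe state. The sketch below is a highly simplified illustration of that pattern; the channel logic, tolerance, and safe-state action are assumptions, and a real decomposition additionally requires evidence of independence such as freedom from interference between the channels.

```c
/* Simplified two-channel pattern often used with ASIL decomposition:
 * each channel computes the safety-relevant value independently and a
 * comparator forces the safe state on disagreement. Channel logic,
 * tolerance, and the safe-state hook are illustrative assumptions. */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define TORQUE_TOLERANCE 5  /* allowed deviation between channels (assumed) */

static int32_t channel_a_torque(int32_t pedal) { return pedal * 2; }
static int32_t channel_b_torque(int32_t pedal) { return pedal << 1; } /* diverse form */

static void enter_safe_state(void) { /* e.g. request zero torque (assumed) */ }

/* Returns the agreed torque request, or forces the safe state. */
int32_t torque_request(int32_t pedal, bool *valid)
{
    int32_t a = channel_a_torque(pedal);
    int32_t b = channel_b_torque(pedal);

    if (abs(a - b) > TORQUE_TOLERANCE) {
        enter_safe_state();
        *valid = false;
        return 0;
    }
    *valid = true;
    return a;
}
```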
DO-178C: Aerospace Software Safety
DO-178C "Software Considerations in Airborne Systems and Equipment Certification" governs software development for civil aviation systems. Published by RTCA (formerly Radio Technical Commission for Aeronautics), it provides the means of compliance with certification requirements established by aviation authorities worldwide, including the FAA and EASA.
Software Levels
DO-178C defines five software levels, commonly referred to as Design Assurance Levels (DAL), from A (highest) to E (lowest), based on the contribution of software failure to aircraft-level failure conditions:
Level A: Catastrophic failure condition - would prevent continued safe flight and landing. Examples include flight control computers and engine control systems.
Level B: Hazardous/Severe-Major failure condition - would reduce capability of aircraft or crew ability to cope. Examples include autopilot systems and navigation displays.
Level C: Major failure condition - would result in significant reduction in safety margins. Examples include communication systems and weather radar.
Level D: Minor failure condition - would not significantly reduce safety, but might slightly increase crew workload. Examples include fault-logging and routine maintenance functions.
Level E: No effect on aircraft operation or crew workload. Examples include passenger entertainment systems. Such software requires no DO-178C compliance.
Objectives-Based Approach
Unlike prescriptive standards, DO-178C specifies objectives that must be satisfied rather than mandating specific methods. This objectives-based approach provides flexibility in how compliance is achieved while maintaining consistent safety outcomes.
The standard defines objectives across software development processes:
Software planning: Objectives for establishing development standards, defining processes, and documenting development environment.
Software development: Objectives for requirements development, design, coding, and integration.
Software verification: Objectives for reviews, analyses, and testing at each development phase.
Configuration management: Objectives for controlling and tracking software configuration throughout development.
Quality assurance: Objectives for ensuring process compliance and work product quality.
Tables specify which objectives apply at each software level, with Level A requiring satisfaction of all objectives and Level D requiring only a subset.
Structural Coverage
DO-178C is notable for its structural coverage requirements, which mandate testing that exercises specific portions of the code structure:
Statement coverage: Every executable statement must be executed at least once. Required for Level C and above.
Decision coverage: Every decision (branch point) must take both true and false values. Required for Level B and above.
Modified Condition/Decision Coverage (MC/DC): Every condition within a decision must be shown to independently affect the decision outcome. Required for Level A.
MC/DC is particularly rigorous, requiring that for each condition in a compound Boolean expression, tests exist showing the condition can change the overall result while other conditions remain fixed.
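As a concrete illustration, consider a decision with three conditions, (a && b) || c. A minimal MC/DC test set needs only four vectors: within the set, each condition has a pair of tests that differ in that condition alone and produce different outcomes. The function and vectors below are illustrative rather than drawn from DO-178C itself.

```c
/* Illustrative MC/DC test set for a three-condition decision.
 * Each condition has a pair of tests differing only in that condition
 * and producing different outcomes. */
#include <assert.h>
#include <stdbool.h>

static bool decision(bool a, bool b, bool c)
{
    return (a && b) || c;
}

int main(void)
{
    /* T1 */ assert(decision(true,  true,  false) == true);
    /* T2 */ assert(decision(false, true,  false) == false); /* T1/T2: 'a' shown independent */
    /* T3 */ assert(decision(true,  false, false) == false); /* T1/T3: 'b' shown independent */
    /* T4 */ assert(decision(true,  false, true)  == true);  /* T3/T4: 'c' shown independent */
    return 0;
}
```

Four vectors suffice here; in general, MC/DC of a decision with n conditions requires at least n + 1 tests, far fewer than the 2^n needed to exercise every combination of condition values.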
Supplements and Tool Qualification
DO-178C is accompanied by several supplements addressing specific technologies:
DO-330: Software Tool Qualification Considerations. Defines processes for qualifying development and verification tools based on their potential impact on software quality.
DO-331: Model-Based Development and Verification Supplement. Provides additional guidance when models are used as design specifications or for code generation.
DO-332: Object-Oriented Technology and Related Techniques Supplement. Addresses specific concerns with object-oriented programming including inheritance, polymorphism, and dynamic binding.
DO-333: Formal Methods Supplement. Provides guidance for using mathematical proof techniques as a complement or alternative to traditional testing.
Tool qualification under DO-330 assigns one of five Tool Qualification Levels (TQL-1, the most rigorous, through TQL-5) based on the tool's potential impact and the software level it supports: tools whose output becomes part of the airborne software, and could therefore insert an error, require higher levels than tools that could only fail to detect an error.
IEC 62304: Medical Device Software
IEC 62304 "Medical device software - Software life cycle processes" specifies requirements for the development and maintenance of medical device software. Published jointly by IEC and ISO, it harmonizes with other medical device standards and regulatory frameworks including FDA requirements and the European Medical Device Regulation.
Software Safety Classification
IEC 62304 defines three software safety classes based on the hazard potential if software fails:
Class A: No injury or damage to health is possible. The software cannot contribute to a hazardous situation or can only contribute to hazards that external risk control measures make not reasonably foreseeable.
Class B: Non-serious injury is possible. The software can contribute to a hazardous situation that could result in injury but not serious injury.
Class C: Death or serious injury is possible. The software can contribute to a hazardous situation that could result in death or serious injury.
Classification considers the software's contribution to overall device risk, accounting for hardware risk control measures and the probability of software failure leading to a hazardous situation.
Development Process Requirements
IEC 62304 specifies requirements across the software lifecycle:
Software development planning: Requirements for defining development processes, deliverables, and verification activities.
Software requirements analysis: Requirements for capturing functional, performance, interface, and safety requirements.
Software architectural design: Requirements for defining software structure and interfaces, with emphasis on risk control at Class C.
Software detailed design: Requirements for specifying software units with sufficient detail for implementation and verification.
Software unit implementation and verification: Requirements for coding and unit testing.
Software integration and integration testing: Requirements for combining software units and verifying their interaction.
Software system testing: Requirements for verifying that integrated software meets requirements.
Requirements are scaled by safety class, with Class C requiring the most comprehensive activities and documentation.
Integration with Risk Management
IEC 62304 integrates tightly with ISO 14971 "Medical devices - Application of risk management to medical devices." Software safety requirements flow from the device-level risk analysis performed under ISO 14971.
Key integration points include:
Safety classification: Software safety class is determined through risk analysis considering potential hazards and their severity.
Safety requirements: Software requirements include those necessary to implement risk control measures identified in the risk management process.
Verification of risk controls: Software verification must confirm that implemented risk controls effectively reduce identified risks.
Residual risk assessment: Evaluation of remaining software-related risks after implementation of controls.
Software of Unknown Provenance
IEC 62304 addresses Software of Unknown Provenance (SOUP), which includes third-party libraries, open-source components, and legacy software not developed under IEC 62304. SOUP is increasingly common in medical devices that incorporate operating systems, communication stacks, and other complex components.
Requirements for SOUP include:
Identification: Documenting all SOUP components including version and configuration.
Risk analysis: Evaluating potential failure modes and their contribution to device-level hazards.
Evaluation: Assessing the SOUP's suitability through analysis of documentation, testing, or operational history.
Verification: Testing SOUP in the intended context to verify correct behavior.
Maintenance and Change Control
IEC 62304 includes specific requirements for software maintenance that are particularly relevant for medical devices with long service lives:
Problem and modification analysis: Evaluating reported problems and proposed changes for their impact on safety.
Modification implementation: Applying changes through the appropriate development process activities based on the change's scope and impact.
Configuration management: Maintaining complete version history and traceability throughout the product lifecycle.
Other Domain-Specific Standards
Beyond the major standards detailed above, numerous other functional safety standards address specific industries and applications.
Railway: EN 50128 and EN 50129
European standards EN 50128 "Railway applications - Communication, signalling and processing systems - Software for railway control and protection systems" and EN 50129 "Railway applications - Communication, signalling and processing systems - Safety related electronic systems for signalling" govern railway safety systems.
EN 50128 defines five software safety integrity levels (SIL 0 through SIL 4) with associated software development requirements. EN 50129 addresses system-level safety requirements and acceptance processes. Both derive from IEC 61508 while incorporating railway-specific considerations.
Industrial: IEC 62443
IEC 62443 series "Security for industrial automation and control systems" addresses cybersecurity for industrial control systems. While focused on security rather than safety, it increasingly intersects with functional safety as cyberattacks can cause safety-relevant failures.
The series defines Security Levels (SL 1-4) analogous to Safety Integrity Levels, with requirements for secure development, security assessment, and security management throughout the system lifecycle.
Nuclear: IEC 61513
IEC 61513 "Nuclear power plants - Instrumentation and control important to safety" provides requirements for instrumentation and control systems in nuclear facilities. It references IEC 61508 while adding nuclear-specific requirements for systems that must function during and after severe accidents.
Machinery: ISO 13849
ISO 13849 "Safety of machinery - Safety-related parts of control systems" provides an alternative approach to IEC 62443 for machinery safety. It uses Performance Levels (PL a through e) rather than SIL and provides simplified methods suitable for industrial machinery applications.
Common Themes Across Standards
Despite their different origins and domain-specific requirements, functional safety standards share common themes that reflect fundamental principles of safety engineering.
Risk-Based Approach
All standards base safety requirements on systematic risk assessment. The rigor of development and verification activities scales with the potential consequences of failure. This risk-based approach allocates engineering effort where it provides the greatest safety benefit.
Systematic Development Process
All standards require documented, repeatable development processes with defined phases, deliverables, and reviews. This systematic approach aims to prevent the introduction of errors and detect those that occur.
Independence and Redundancy
Higher integrity levels typically require independence between functions, between development team members, or between redundant channels. Independence limits the propagation of systematic errors and common-cause failures.
Verification and Validation
All standards require extensive verification (confirming correct implementation) and validation (confirming the right system was built). Multiple verification methods, from review to testing to formal analysis, provide defense in depth against errors.
Traceability and Documentation
Complete traceability from requirements through implementation to verification is universal. This documentation enables impact analysis, supports certification, and provides evidence that safety was systematically addressed.
Implementing Functional Safety
Implementing functional safety standards requires organizational commitment, skilled personnel, and appropriate tools and processes.
Organizational Considerations
Safety culture: Effective functional safety requires organizational culture that prioritizes safety and supports raising concerns. Management must demonstrate commitment and allocate sufficient resources.
Competency management: Standards require personnel with appropriate competencies for their roles. Organizations must assess competencies, provide training, and maintain records.
Independence: Higher integrity levels require independent assessment of safety activities. This may require separate teams or external assessment depending on the standard and level.
Process and Tool Selection
Tailored processes: Development processes must address standard requirements while remaining practical. Process tailoring balances compliance with efficiency.
Tool qualification: Development and verification tools may require qualification to demonstrate they do not compromise safety. Tool selection should consider qualification burden.
Automation: Automated verification, traceability management, and documentation generation can reduce manual effort and improve consistency.
Certification and Assessment
Self-assessment: Organizations may perform internal assessments demonstrating compliance with standards. This requires competent assessors with appropriate independence.
Third-party certification: Some industries or customers require certification by accredited bodies. Certification processes vary by standard and jurisdiction.
Regulatory approval: Safety-critical products often require regulatory approval before market entry. Understanding regulatory expectations is essential for successful approval.
Challenges and Future Directions
Functional safety faces ongoing challenges as technology evolves and systems become more complex.
Autonomous Systems and Machine Learning
Traditional functional safety standards assume systems behave deterministically according to specified requirements. Autonomous systems using machine learning challenge this assumption, as their behavior emerges from training rather than explicit programming.
Emerging guidance addresses AI-based systems in safety contexts, including requirements for training data quality, performance monitoring, and graceful degradation when operating outside trained conditions.
Cybersecurity Integration
The convergence of safety and security creates challenges for standards developed independently. New editions of safety standards increasingly reference cybersecurity requirements, and combined safety-security analyses are becoming standard practice.
Agile Development
Traditional safety standards assume sequential, document-heavy development processes. Reconciling this with modern agile practices requires careful interpretation of standard requirements and appropriate process adaptations.
Supply Chain Complexity
Modern systems incorporate components from numerous suppliers, including open-source software and commercial components. Managing safety requirements across complex supply chains requires clear interfaces, appropriate contracts, and verification of supplier contributions.
Summary
Functional safety standards provide essential frameworks for developing systems where failure could cause harm. IEC 61508 establishes the foundational concepts applicable across industries, while domain-specific standards like ISO 26262, DO-178C, and IEC 62304 adapt these principles for automotive, aerospace, and medical applications respectively.
Despite their different details, all functional safety standards share common themes: risk-based requirements, systematic development processes, verification and validation, and comprehensive documentation. Understanding both the specific requirements of applicable standards and these underlying principles enables engineers to develop truly safe systems.
As technology continues to evolve, functional safety practices must adapt to address new challenges including autonomous systems, cybersecurity threats, and complex supply chains. The fundamental goal, however, remains constant: ensuring that electronic systems protect rather than harm the people who depend on them.