Aging and Degradation Mechanisms
All electronic components degrade over time. While modern manufacturing has dramatically improved component reliability, understanding the fundamental mechanisms that cause components to wear out is essential for predicting service life, designing reliable systems, and conducting effective failure analysis. This knowledge enables engineers to select appropriate components, implement protective measures, and establish realistic maintenance schedules.
Component degradation results from a combination of intrinsic material properties and extrinsic environmental stresses. Some mechanisms operate continuously during normal operation, while others require specific conditions such as high temperature, humidity, or electrical stress. By understanding these mechanisms, engineers can design systems that minimize degradation rates and achieve their reliability targets.
Electromigration Effects
Electromigration is the transport of metal atoms in a conductor due to momentum transfer from conducting electrons. As electrons flow through a metal conductor, they collide with metal atoms and gradually push them in the direction of electron flow. Over time, this creates voids where atoms have been depleted and hillocks or extrusions where atoms have accumulated.
Physics of Electromigration
The driving force for electromigration is the electron wind force, which results from the momentum exchange between conduction electrons and metal ions. The rate of electromigration depends on several factors:
- Current density - Higher current density means more electrons colliding with metal atoms, which sharply accelerates electromigration. Lifetime typically falls as an inverse power of current density, with an exponent between 1 and 2 (an exponent of 2 gives the classic j squared dependence), making current density one of the most critical design factors
- Temperature - Electromigration follows Arrhenius behavior, with rates increasing exponentially with temperature. The activation energy depends on the dominant diffusion path
- Grain structure - Grain boundaries provide fast diffusion paths. Large-grained or bamboo-structured conductors resist electromigration better than fine-grained materials
- Material composition - Adding small amounts of copper to aluminum, or using copper instead of aluminum, significantly improves electromigration resistance
Black's Equation
The mean time to failure (MTTF) due to electromigration is commonly estimated using Black's equation:
MTTF = A * j^(-n) * exp(Ea / (k * T))
Where A is a constant, j is current density, n is the current density exponent (typically 1-2), Ea is the activation energy, k is Boltzmann's constant, and T is absolute temperature. This equation shows the strong dependence on current density and the exponential dependence on temperature.
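Because the constant A is rarely known in advance, Black's equation is most often used as a ratio between a stress condition and a use condition, where A cancels. The following is a minimal sketch of that calculation. The function names and all numeric values (current densities, temperatures, n = 2, Ea = 0.9 eV) are illustrative assumptions, not data for any particular process.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant k in eV/K


def black_mttf(a_const, j, n, ea_ev, temp_k):
    """Black's equation: MTTF = A * j^(-n) * exp(Ea / (k * T))."""
    return a_const * j ** (-n) * math.exp(ea_ev / (BOLTZMANN_EV_PER_K * temp_k))


def black_acceleration_factor(j_use, j_stress, t_use_k, t_stress_k, n, ea_ev):
    """MTTF(use) / MTTF(stress). The constant A cancels in the ratio."""
    current_term = (j_stress / j_use) ** n
    thermal_term = math.exp(
        (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    )
    return current_term * thermal_term


# Assumed, illustrative conditions (not taken from any real process):
# stress test at 2 MA/cm^2 and 300 C, use at 0.5 MA/cm^2 and 105 C.
af = black_acceleration_factor(
    j_use=0.5e6, j_stress=2.0e6,
    t_use_k=378.15, t_stress_k=573.15,
    n=2.0, ea_ev=0.9,
)
print(f"Acceleration factor (use vs. stress): {af:.3g}")
```

Multiplying a measured MTTF at the stress condition by this factor gives an estimate of MTTF at the use condition, and that estimate is only as good as the assumed n and Ea.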
Electromigration in Modern ICs
As integrated circuit features have shrunk, electromigration has become increasingly important. Current densities in modern interconnects can exceed 1 MA/cm^2, approaching limits where electromigration becomes significant within product lifetimes. Design rules now include strict limits on maximum current density for different metal layers and via configurations.
Modern processes address electromigration through several approaches: using copper interconnects instead of aluminum (copper has approximately 5 times better electromigration resistance), adding liner and cap layers that slow diffusion, optimizing grain structure during deposition, and implementing redundant via strategies to prevent open circuits even if some vias fail.
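To make the current-density figures concrete, the sketch below computes the average current density in a rectangular interconnect and compares it with a limit. The helper name, the wire dimensions, and the 1 MA/cm^2 limit used here are hypothetical; real limits come from the foundry design rules for each layer.

```python
def current_density_a_per_cm2(current_a, width_um, thickness_um):
    """Average current density j = I / (w * t) for a rectangular conductor."""
    width_cm = width_um * 1e-4        # 1 um = 1e-4 cm
    thickness_cm = thickness_um * 1e-4
    return current_a / (width_cm * thickness_cm)


# Hypothetical wire: 1 mA through a 0.1 um x 0.2 um cross-section.
j = current_density_a_per_cm2(current_a=1e-3, width_um=0.1, thickness_um=0.2)

ASSUMED_LIMIT_A_PER_CM2 = 1.0e6  # placeholder limit, not a real design rule
print(f"j = {j:.2e} A/cm^2, exceeds assumed limit: {j > ASSUMED_LIMIT_A_PER_CM2}")
```

Even a milliamp-level current produces several MA/cm^2 in a wire of this size, which is why narrow wires are widened or paralleled on high-current nets.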
Electromigration in Package and Board Level
While most attention focuses on IC-level electromigration, similar effects can occur at higher levels of assembly. Wire bonds, solder bumps, and PCB traces carrying high current densities can experience electromigration-induced failures, particularly in power electronics applications. Proper design must consider current density limits throughout the entire current path.
Dielectric Breakdown
Dielectric breakdown occurs when an insulating material can no longer withstand the applied electric field and becomes conductive. In electronics, this can affect oxide layers in transistors, capacitor dielectrics, insulation between conductors, and conformal coatings. Breakdown can be instantaneous and catastrophic or can develop gradually through time-dependent mechanisms.
Intrinsic and Extrinsic Breakdown
Intrinsic breakdown represents the fundamental limit of a perfect dielectric material. It occurs when the electric field accelerates electrons to energies sufficient to ionize atoms, creating an avalanche of charge carriers. Intrinsic breakdown strength for silicon dioxide is approximately 10 MV/cm.
Extrinsic breakdown occurs at lower field strengths due to defects in the dielectric. These defects can include particles, pinholes, weak spots, contamination, or interface states. Most practical breakdown events are extrinsic, occurring at a fraction of the intrinsic breakdown strength.
Time-Dependent Dielectric Breakdown (TDDB)
Even when operated well below instantaneous breakdown voltage, dielectrics gradually degrade under electrical stress. This time-dependent dielectric breakdown results from the accumulation of damage over time, eventually creating a conductive path through the dielectric.
Several models describe TDDB, including the E-model (linear field dependence), the 1/E model (reciprocal field dependence), and the power-law model. The choice of model affects reliability projections, with different models predicting significantly different lifetimes when extrapolating from accelerated test conditions to use conditions.
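The sensitivity to model choice can be illustrated with a small sketch. It fits each of the three field dependences to the same two accelerated test points and then extrapolates to a lower use field. The functional forms used are the commonly quoted ones (ln t linear in E, ln t linear in 1/E, and t proportional to a power of E); the test data and use field are invented for illustration only.

```python
import math


def fit_e_model(e1, t1, e2, t2):
    """E-model: ln(t) decreases linearly with field E."""
    gamma = (math.log(t1) - math.log(t2)) / (e2 - e1)
    return lambda e: t1 * math.exp(gamma * (e1 - e))


def fit_inverse_e_model(e1, t1, e2, t2):
    """1/E model: ln(t) increases linearly with 1/E."""
    g = (math.log(t1) - math.log(t2)) / (1.0 / e1 - 1.0 / e2)
    return lambda e: t1 * math.exp(g * (1.0 / e - 1.0 / e1))


def fit_power_law(e1, t1, e2, t2):
    """Power law: t proportional to E^(-n)."""
    n = (math.log(t1) - math.log(t2)) / (math.log(e2) - math.log(e1))
    return lambda e: t1 * (e1 / e) ** n


# Invented accelerated-test points: (field in MV/cm, time to breakdown in s).
E1, T1 = 10.0, 100.0
E2, T2 = 12.0, 1.0
E_USE = 5.0  # assumed use field in MV/cm

models = {
    "E-model": fit_e_model(E1, T1, E2, T2),
    "power law": fit_power_law(E1, T1, E2, T2),
    "1/E model": fit_inverse_e_model(E1, T1, E2, T2),
}
for name, model in models.items():
    print(f"{name:10s} predicted lifetime at {E_USE} MV/cm: {model(E_USE):.2e} s")
```

All three curves pass through the same test data, yet with these made-up numbers the extrapolated lifetimes differ by several orders of magnitude, with the E-model giving the most conservative estimate.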
TDDB is particularly critical in gate oxides, where oxides have thinned to just a few nanometers in advanced process nodes. At these thicknesses, quantum mechanical tunneling current flows continuously, and the oxide must survive this stress for the product lifetime.
Oxide Breakdown Mechanisms
Several mechanisms contribute to oxide degradation:
- Trap generation - Electrical stress creates electron and hole traps in the oxide and at interfaces. These traps accumulate over time and eventually form a conductive percolation path
- Charge injection - Hot carriers can be injected into the oxide, creating damage and changing threshold voltages. This is particularly problematic in flash memory and EEPROMs
- Hydrogen release - Hot carriers can break Si-H bonds at interfaces, releasing hydrogen that can diffuse and cause damage elsewhere
- Anode hole injection - At high fields, electrons tunneling through the oxide can generate holes in the anode that are injected back into the oxide, causing damage
Capacitor Dielectric Degradation
Capacitor dielectrics also experience time-dependent degradation. Electrolytic capacitors are particularly susceptible, with the oxide film on the anode gradually degrading over time. High temperature and high ripple current accelerate this degradation.
Ceramic capacitors can experience degradation through various mechanisms depending on the dielectric formulation. High-capacitance multilayer ceramic capacitors using X5R or Y5V dielectrics show more degradation than stable C0G/NP0 types. Voltage derating is commonly used to extend capacitor life.
Whisker Growth
Metal whiskers are thin, hair-like crystalline structures that spontaneously grow from metal surfaces. Tin whiskers are the most problematic in electronics, as the transition to lead-free soldering has increased the use of pure or high-tin alloys that are prone to whisker formation. Whiskers can grow to lengths sufficient to bridge adjacent conductors, causing short circuits and system failures.
Whisker Formation Mechanisms
Whisker growth is driven by compressive stress in the metal film. This stress can arise from several sources:
- Intermetallic formation - When tin is plated over copper, copper diffuses into the tin and forms intermetallics (Cu6Sn5, Cu3Sn). This creates compressive stress in the remaining tin
- Thermal expansion mismatch - Different thermal expansion coefficients between the tin coating and substrate create stress during temperature cycling
- Mechanical stress - Bending, clamping, or other mechanical forces create stress that promotes whisker growth
- Corrosion - Oxide formation and corrosion products can create stress in the underlying metal
Whiskers grow to relieve this compressive stress, extruding from the surface through grain boundaries or weak spots in oxide layers. Growth rates vary widely but can exceed 1 mm per year under adverse conditions.
Whisker Characteristics
Tin whiskers typically range from 1 to 10 micrometers in diameter and can grow to lengths exceeding 10 mm, though most are shorter. They are single crystals of tin with excellent electrical conductivity. Despite their small diameter, they can carry enough current to cause permanent damage when bridging circuit nodes.
Whiskers can be straight, kinked, or curled. They grow at varying angles from the surface and can break off due to handling or thermal shock, potentially creating conductive debris that causes intermittent failures.
Whisker Mitigation Strategies
Several strategies help mitigate whisker risk:
- Matte tin instead of bright tin - Matte tin finishes with larger grains show reduced whisker propensity compared to bright tin
- Nickel barrier layers - A nickel underplate between copper and tin prevents copper diffusion and the associated stress
- Annealing - Post-plating annealing can reduce residual stress and form stable intermetallics
- Tin alloys - Adding small amounts of bismuth, silver, or copper to tin reduces whisker growth
- Conformal coating - Coating over tin surfaces can prevent whiskers from causing shorts, though whiskers may still grow under and penetrate thin coatings
- Component spacing - Maintaining adequate spacing between conductors reduces the probability of whisker-induced shorts
Industry Standards for Whiskers
Several industry standards address whisker testing and acceptance criteria. JEDEC JESD201 describes test methods for assessing whisker growth, including ambient storage, elevated temperature and humidity, and temperature cycling tests. IPC-4552 specifies requirements for electroless nickel/immersion gold (ENIG) finishes, a tin-free alternative that avoids tin whisker formation. These standards help manufacturers demonstrate that their products meet acceptable whisker risk levels.
Corrosion Mechanisms
Corrosion is the degradation of materials through chemical or electrochemical reactions with their environment. In electronics, corrosion can cause open circuits by consuming conductors, short circuits through the formation of conductive corrosion products, and degraded connections at contact interfaces. Understanding corrosion mechanisms is essential for designing electronics that survive in humid, contaminated, or otherwise aggressive environments.
Electrochemical Corrosion
Most electronic corrosion is electrochemical, requiring an anode (oxidation site), cathode (reduction site), electrical connection between them, and an electrolyte (moisture film with dissolved ions). When these conditions are met, metal dissolves at the anode while reduction reactions occur at the cathode.
Galvanic corrosion occurs when dissimilar metals are electrically connected in the presence of an electrolyte. The more active metal (anode) corrodes preferentially. The galvanic series ranks metals by their electrode potential, with active metals like zinc and aluminum at one end and noble metals like gold and platinum at the other. Large cathode-to-anode area ratios accelerate galvanic corrosion.
Electrochemical Migration
Electrochemical migration (ECM) is a particularly insidious form of corrosion in electronics. Under bias in the presence of moisture, metal ions dissolve at the anode, migrate through the electrolyte, and plate out at the cathode, forming metallic dendrites. These dendrites grow toward the anode and can eventually bridge the gap, causing short circuits.
Silver is especially susceptible to ECM and can form dendrites rapidly under appropriate conditions. Copper, tin, and lead also exhibit ECM. The risk increases with closer spacing between conductors, higher voltage, higher humidity, and the presence of contamination (particularly ionic contamination like chlorides).
Atmospheric Corrosion
Atmospheric corrosion occurs when metal surfaces are exposed to ambient air containing moisture and pollutants. Key factors include:
- Humidity - Water films form on surfaces above critical relative humidity (typically 40-70% depending on contamination). These films enable electrochemical reactions
- Temperature cycling - Cycling through the dew point causes condensation, depositing water that may contain dissolved contaminants
- Pollutants - Sulfur compounds (SO2, H2S), nitrogen oxides, chlorides, and other airborne contaminants accelerate corrosion. Industrial and marine environments are particularly aggressive
- Time of wetness - The total time surfaces remain wet is a key parameter for predicting corrosion
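Whether a temperature swing passes through the dew point can be estimated from air temperature and relative humidity. The sketch below uses the Magnus approximation with one commonly published coefficient pair (17.62 and 243.12 C); other coefficient sets exist, and the function name and example conditions are assumptions made for illustration.

```python
import math

# Magnus coefficients for water vapor over liquid water (one common set).
MAGNUS_B = 17.62
MAGNUS_C = 243.12  # degrees C


def dew_point_c(air_temp_c, relative_humidity_pct):
    """Approximate dew point in degrees C from temperature and RH (0-100]."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative humidity must be in (0, 100]")
    gamma = (math.log(relative_humidity_pct / 100.0)
             + MAGNUS_B * air_temp_c / (MAGNUS_C + air_temp_c))
    return MAGNUS_C * gamma / (MAGNUS_B - gamma)


# Example: enclosure air at 25 C and 50% RH.
td = dew_point_c(25.0, 50.0)
print(f"Dew point: {td:.1f} C")
```

Any surface inside the enclosure that cools below this temperature during a cycle can be expected to collect condensation.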
Corrosion in Specific Materials
Different materials exhibit different corrosion behaviors:
- Copper - Forms protective patina in clean atmospheres but corrodes rapidly in sulfur-containing environments. Susceptible to stress corrosion cracking in ammonia atmospheres
- Aluminum - Forms protective oxide layer but susceptible to pitting in chloride environments. Galvanically incompatible with most other metals
- Silver - Tarnishes readily in sulfur environments, forming silver sulfide. Highly susceptible to electrochemical migration
- Gold - Excellent corrosion resistance but thin plating may have porosity allowing substrate corrosion
- Tin - Forms protective oxide but susceptible to whisker growth and can develop conductive tin oxide in certain conditions
Corrosion Prevention
Strategies for preventing corrosion in electronics include controlling the environment (low humidity, filtered air), using appropriate material combinations, applying protective coatings (conformal coatings, hermetic sealing), ensuring cleanliness (removing flux residues and ionic contamination), and designing for drainage and ventilation to minimize moisture accumulation.
Thermal Cycling Effects
Thermal cycling subjects electronic assemblies to repeated temperature changes that cause expansion and contraction of materials. Because different materials have different coefficients of thermal expansion (CTE), thermal cycling creates mechanical stress at interfaces between dissimilar materials. This stress accumulates over many cycles, eventually causing fatigue failures in solder joints, wire bonds, die attach, and other interconnections.
CTE Mismatch Stress
The strain induced by thermal cycling between materials with different CTEs depends on the temperature range and the CTE difference. For example, a silicon die (CTE approximately 2.6 ppm/K) mounted on an FR-4 PCB (CTE approximately 14-17 ppm/K) experiences significant stress during temperature cycling.
The stress is highest at the corners and edges of interfaces, where the accumulated strain from the entire interface concentrates. This is why solder ball failures in BGA packages typically begin at corner balls, and why underfill is used to distribute stress more uniformly.
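A first-order estimate of the shear strain imposed on a joint follows from the CTE difference, the temperature swing, the distance of the joint from the neutral point of the component, and the joint height. The sketch below encodes that estimate. It deliberately ignores lead and board compliance, warpage, and underfill, and the dimensions used are hypothetical.

```python
def cte_mismatch_shear_strain(cte_component_ppm, cte_board_ppm,
                              delta_t_k, dnp_mm, joint_height_mm):
    """First-order shear strain: (delta CTE * delta T * DNP) / joint height.

    DNP is the distance from the neutral point (usually the component
    center) to the joint in question.
    """
    delta_cte = abs(cte_board_ppm - cte_component_ppm) * 1e-6  # ppm/K -> 1/K
    displacement_mm = delta_cte * delta_t_k * dnp_mm
    return displacement_mm / joint_height_mm


# Silicon (2.6 ppm/K) against FR-4 (assumed 16 ppm/K), 100 K swing,
# joint 5 mm from the center, 0.1 mm standoff (all hypothetical).
strain = cte_mismatch_shear_strain(2.6, 16.0, 100.0, 5.0, 0.1)
print(f"Estimated shear strain range: {strain:.3f}")
```

The linear dependence on distance from the neutral point is the reason corner joints see the largest strain, and the inverse dependence on joint height is why taller joints last longer.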
Solder Joint Fatigue
Solder joints are the most common location for thermal cycling failures. The joint must accommodate the relative movement between component and board while maintaining electrical connection. Each thermal cycle causes plastic deformation in the solder, and this deformation accumulates over many cycles.
Solder fatigue life is typically estimated using the Coffin-Manson relationship, which relates the number of cycles to failure to the plastic strain range per cycle. Factors affecting solder joint fatigue life include:
- Joint geometry - Larger standoff height allows more compliance and longer life. Smaller joints with less solder volume have shorter lives
- Component size - Larger components create larger strain ranges and shorter solder joint life
- Temperature range - Wider temperature excursions cause more damage per cycle
- Dwell time - Longer dwells at temperature extremes allow stress relaxation through creep, potentially increasing damage per cycle
- Ramp rate - Very fast temperature changes can create thermal shock that causes additional damage mechanisms
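The basic Coffin-Manson relationship can be written as delta_eps_p / 2 = eps_f * (2 * N_f)^c, where delta_eps_p is the plastic strain range, eps_f is the fatigue ductility coefficient, and c is a negative fatigue ductility exponent. The sketch below solves it for cycles to failure. The material constants are placeholders chosen for illustration, and practical solder models add corrections for temperature and dwell that are not included here.

```python
def coffin_manson_cycles(plastic_strain_range, ductility_coeff, ductility_exp):
    """Solve delta_eps_p / 2 = eps_f * (2 * N_f) ** c for N_f."""
    if ductility_exp >= 0:
        raise ValueError("fatigue ductility exponent must be negative")
    if plastic_strain_range <= 0 or ductility_coeff <= 0:
        raise ValueError("strain range and coefficient must be positive")
    ratio = plastic_strain_range / (2.0 * ductility_coeff)
    return 0.5 * ratio ** (1.0 / ductility_exp)


# Placeholder constants (assumed): eps_f = 0.325, c = -0.5.
cycles = coffin_manson_cycles(plastic_strain_range=0.01,
                              ductility_coeff=0.325,
                              ductility_exp=-0.5)
print(f"Estimated cycles to failure: {cycles:.0f}")
```

With an exponent of -0.5, doubling the strain range cuts the predicted life by a factor of four, which shows why modest reductions in strain (taller joints, smaller components, narrower temperature range) pay off disproportionately.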
Wire Bond Fatigue
Wire bonds connecting die to package leads or substrate are also susceptible to thermal cycling damage. The bond itself, the heel of the wire where it leaves the bond, and the loop shape all affect reliability. CTE mismatch between die and substrate causes the wire to flex with each thermal cycle, eventually fatiguing at stress concentration points.
Die Attach Fatigue
The attachment between semiconductor die and package or substrate must survive thermal cycling while maintaining good thermal conductivity. Die attach materials range from hard solders to soft epoxies, with different fatigue characteristics. Voiding in die attach creates stress concentrations that accelerate failure. Large die require careful attention to die attach material selection and process control.
Thermal Cycling Testing
Accelerated thermal cycling testing uses conditions more severe than typical use conditions to precipitate failures in reasonable test times. Standards such as IPC-9701 for solder joint reliability and JEDEC JESD22-A104 for temperature cycling define test conditions. Test results are used with acceleration models to predict field reliability.
Mechanical Fatigue
Mechanical fatigue occurs when materials are subjected to repeated stress cycles, eventually failing at stress levels well below their static strength. In electronics, mechanical fatigue can result from vibration, shock, repeated flexing, or thermal cycling. Fatigue failures typically initiate as small cracks at stress concentration points and propagate with continued cycling until final fracture.
Vibration-Induced Fatigue
Vibration exposes electronic assemblies to repeated mechanical stress. Random vibration, as encountered in transportation or operation near rotating machinery, excites resonances throughout the assembly. Components and their attachments experience cyclic stress that causes fatigue damage.
Critical factors for vibration fatigue include:
- Natural frequencies - Components and assemblies have resonant frequencies where vibration is amplified. Exciting these resonances dramatically increases stress
- Transmissibility - How vibration is transmitted from the mounting point to critical components affects the stress experienced
- Component mass - Heavier components generate more inertial force and higher stress in their attachments
- Lead/terminal stiffness - Stiff leads transmit more force to solder joints; compliant leads provide isolation
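Resonant amplification can be illustrated with a single-degree-of-freedom model of a component on a compliant mount. The sketch below computes the undamped natural frequency and the base-excitation transmissibility of such a model. Real assemblies have many coupled modes, so this is only a first approximation, and the stiffness, mass, and damping values are assumed.

```python
import math


def natural_frequency_hz(stiffness_n_per_m, mass_kg):
    """Undamped natural frequency f_n = sqrt(k / m) / (2 * pi)."""
    return math.sqrt(stiffness_n_per_m / mass_kg) / (2.0 * math.pi)


def transmissibility(freq_ratio, damping_ratio):
    """Base-excitation transmissibility of a damped single-DOF system.

    freq_ratio is excitation frequency / natural frequency.
    Undefined for zero damping exactly at resonance (division by zero).
    """
    r2 = freq_ratio ** 2
    damping_term = (2.0 * damping_ratio * freq_ratio) ** 2
    return math.sqrt((1.0 + damping_term) / ((1.0 - r2) ** 2 + damping_term))


# Assumed example: 10 g component on a mount with 4e4 N/m stiffness,
# 5% damping ratio.
f_n = natural_frequency_hz(4.0e4, 0.010)
q_at_resonance = transmissibility(1.0, 0.05)
print(f"f_n = {f_n:.0f} Hz, amplification at resonance = {q_at_resonance:.1f}x")
```

Lightly damped structures amplify input vibration roughly tenfold or more at resonance in this model, which is why shifting natural frequencies away from the dominant excitation band is a common design goal.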
Flexural Fatigue
PCBs can experience flexural fatigue from repeated bending. This is a concern in applications where boards are flexed during operation or where thermal expansion of the enclosure causes board bending. Ceramic components such as MLCCs are particularly susceptible to flex-induced cracking.
Board flexure during assembly processes, particularly during in-circuit testing or connector insertion, can initiate cracks that propagate during subsequent thermal cycling or vibration.
Creep and Stress Relaxation
Creep is the time-dependent deformation of materials under constant stress. In solder joints, creep allows stress relaxation when assemblies are held at elevated temperature. While this relaxation reduces instantaneous stress, the accumulated creep strain contributes to fatigue damage.
Creep is thermally activated and becomes significant in tin-based solders at typical operating temperatures, because even room temperature is more than half of their absolute melting temperature (a homologous temperature above 0.5). Lead-free solders generally have better creep resistance than tin-lead, and their higher melting points mean that a given operating temperature corresponds to a somewhat lower homologous temperature.
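Homologous temperature is the operating temperature divided by the melting temperature, both expressed in kelvin. The short sketch below computes it for two solder families. The melting points used (183 C for eutectic tin-lead and about 217 C for a SAC-type lead-free alloy) are approximate, commonly quoted values and are treated here as assumptions.

```python
def homologous_temperature(operating_c, melting_c):
    """Ratio of absolute operating temperature to absolute melting temperature."""
    return (operating_c + 273.15) / (melting_c + 273.15)


# Approximate melting points in degrees C (assumed for illustration).
SOLDER_MELTING_C = {"Sn63Pb37": 183.0, "SAC (lead-free)": 217.0}

for alloy, melt_c in SOLDER_MELTING_C.items():
    for temp_c in (25.0, 100.0):
        th = homologous_temperature(temp_c, melt_c)
        print(f"{alloy:16s} at {temp_c:5.1f} C: T/Tm = {th:.2f}")
```

Both alloys sit at roughly 0.6 or higher even at room temperature, well above the 0.5 level often cited as the point where creep becomes an important deformation mechanism.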
Fatigue Analysis Methods
Fatigue life prediction uses material fatigue properties (S-N curves or strain-life data) combined with stress analysis to estimate cycles to failure. Miner's rule provides a simple method for combining damage from different stress levels. More sophisticated methods account for stress sequence effects, mean stress, and multiaxial stress states.
Finite element analysis (FEA) is commonly used to determine stress distributions in electronic assemblies under thermal and mechanical loading. These results feed into fatigue life calculations to predict reliability.
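Miner's rule sums the fraction of life consumed at each stress level: damage D is the sum of n_i / N_i, where n_i is the number of cycles applied at level i and N_i is the cycles to failure at that level, with failure predicted when D reaches 1. A minimal sketch follows; the load spectrum in the example is invented.

```python
def miners_damage(load_blocks):
    """Cumulative damage D = sum(n_i / N_i).

    load_blocks is an iterable of (applied_cycles, cycles_to_failure) pairs.
    """
    damage = 0.0
    for applied, to_failure in load_blocks:
        if to_failure <= 0:
            raise ValueError("cycles to failure must be positive")
        damage += applied / to_failure
    return damage


# Invented yearly spectrum: (cycles applied per year, cycles to failure).
yearly_blocks = [(100, 1_000), (200, 4_000)]
damage_per_year = miners_damage(yearly_blocks)
print(f"Damage per year: {damage_per_year:.3f}")
print(f"Predicted life: {1.0 / damage_per_year:.1f} years")
```

The rule treats damage as linear and independent of load order, which is the limitation the more sophisticated methods mentioned above are meant to address.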
Radiation Effects
Radiation can cause both temporary and permanent damage to electronic components. Effects range from transient upsets that cause momentary errors to permanent degradation that renders devices non-functional. Understanding radiation effects is essential for designing electronics for space, nuclear, and medical applications, as well as for high-altitude aviation and some industrial environments.
Total Ionizing Dose (TID)
Total ionizing dose refers to the cumulative effect of ionizing radiation absorbed by semiconductor devices. Radiation creates electron-hole pairs in oxide layers. While most recombine, some holes become trapped in the oxide or at interfaces, causing threshold voltage shifts, increased leakage current, and degraded timing.
TID effects can be dose-rate dependent, with some devices (notably bipolar linear circuits) showing enhanced degradation at low dose rates (enhanced low dose rate sensitivity or ELDRS). This complicates testing and life prediction for space applications where dose rates are low but cumulative doses can be significant.
Single Event Effects (SEE)
Single event effects result from individual energetic particles striking sensitive regions of semiconductor devices. Types of SEE include:
- Single event upset (SEU) - A particle strike changes the state of a memory cell or flip-flop. This is a soft error that can be corrected by rewriting the correct data
- Single event latch-up (SEL) - A particle strike triggers the parasitic thyristor structure in CMOS devices, potentially causing destructive overcurrent if not quickly interrupted
- Single event burnout (SEB) - A particle strike triggers destructive breakdown in power transistors, typically power MOSFETs or IGBTs
- Single event gate rupture (SEGR) - A particle strike causes gate oxide failure in power MOSFETs
- Single event transient (SET) - A particle strike causes a transient voltage pulse that can propagate through combinational logic
Displacement Damage
Non-ionizing energy loss from particle radiation can displace atoms from their lattice positions, creating defects that degrade semiconductor properties. Displacement damage is particularly significant for bipolar devices, optoelectronics (LEDs, solar cells, image sensors), and some CMOS technologies. Heavy particles and neutrons cause more displacement damage than photons or electrons of the same energy.
Radiation Hardening
Radiation-hardened components use design and process techniques to improve radiation tolerance:
- Process hardening - Special oxide processes, epitaxial layers, and isolation structures reduce TID sensitivity
- Design hardening - Larger transistors, redundant circuits, error correction, and temporal filtering reduce SEE sensitivity
- Shielding - Metal enclosures attenuate some radiation, though they can also create secondary particles
- Component selection - Some device types and process nodes are inherently more radiation tolerant
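One widely used form of the circuit redundancy listed under design hardening is triple modular redundancy (TMR), in which three copies of a value are kept and a majority vote masks an upset in any single copy. TMR is not named in the text above and is offered here only as an illustrative example. The sketch shows a bitwise voter in software; hardware implementations apply the same logic per bit.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three integer copies of the same word."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word, bit_index):
    """Model a single event upset by inverting one bit."""
    return word ^ (1 << bit_index)


stored = 0b1011_0010
copies = [stored, stored, stored]

# A particle strike corrupts one bit in one copy.
copies[2] = flip_bit(copies[2], 5)

recovered = majority_vote(*copies)
print(f"Recovered word matches original: {recovered == stored}")
```

The vote fails if the same bit is upset in two copies before the error is scrubbed, so TMR is usually paired with periodic rewriting of the voted value.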
Accelerated Life Testing
Accelerated life testing (ALT) applies stress levels higher than normal operating conditions to cause failures in shorter times. The test results are then extrapolated to predict life under use conditions. ALT enables reliability assessment of components and systems without waiting for real-time failures that might take years to occur.
Acceleration Models
Acceleration models relate test conditions to use conditions, allowing extrapolation of test results. Common models include:
- Arrhenius model - Describes thermally activated degradation. The acceleration factor depends exponentially on the activation energy and on the difference between the reciprocal absolute temperatures at use and test conditions. Widely used for chemical degradation, diffusion-controlled failures, and semiconductor wear-out mechanisms
- Inverse power law - Describes voltage or current stress acceleration. Commonly used for dielectric breakdown and electromigration
- Eyring model - Extends the Arrhenius model to include additional stress factors such as humidity or voltage
- Coffin-Manson model - Describes thermal cycling fatigue. Relates cycles to failure to temperature range and other factors
- Peck model - Describes combined temperature and humidity acceleration, commonly used for moisture-related failures
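Several of the models listed above reduce to simple acceleration-factor formulas. The sketch below implements three of them: Arrhenius, inverse power law, and Peck (written here as a humidity power law multiplied by an Arrhenius term). The activation energy and humidity exponent in the example are placeholder values; in practice they must be determined for the specific failure mechanism.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # eV/K


def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """AF = exp((Ea / k) * (1/T_use - 1/T_stress)), temperatures in kelvin."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K)
                    * (1.0 / t_use_k - 1.0 / t_stress_k))


def inverse_power_af(stress_level, use_level, exponent):
    """AF = (stress / use) ** exponent, e.g. for voltage or current."""
    return (stress_level / use_level) ** exponent


def peck_af(ea_ev, t_use_c, t_stress_c, rh_use_pct, rh_stress_pct, rh_exponent):
    """Peck: humidity power law multiplied by the Arrhenius term."""
    humidity_term = inverse_power_af(rh_stress_pct, rh_use_pct, rh_exponent)
    return humidity_term * arrhenius_af(ea_ev, t_use_c, t_stress_c)


# Example: 85 C / 85% RH test versus 40 C / 60% RH use.
# Ea = 0.79 eV and exponent 2.7 are assumed placeholder values.
af = peck_af(0.79, 40.0, 85.0, 60.0, 85.0, 2.7)
print(f"Peck acceleration factor: {af:.0f}")
```

An acceleration factor converts test hours into equivalent use hours, so errors in the assumed activation energy or exponent propagate directly, and exponentially in the case of the activation energy, into the predicted life.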
Test Design Considerations
Effective accelerated testing requires careful attention to several factors:
- Failure mechanism preservation - Test conditions must accelerate the same failure mechanisms that occur in use. Overly aggressive conditions may introduce new mechanisms not relevant to actual use
- Activation energy determination - For thermally accelerated tests, the activation energy must be known or determined. Using an incorrect activation energy leads to large extrapolation errors
- Sample size - Sufficient samples must be tested to characterize the failure distribution, not just the mean life
- Test duration - Tests must run long enough to produce a meaningful number of failures. Zero-failure tests provide limited statistical confidence
- Multiple stress levels - Testing at multiple stress levels allows verification of the acceleration model and estimation of activation energy
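The limited confidence of zero-failure tests can be quantified with the success-run relation from the binomial distribution: if n units all survive the test, the reliability R demonstrated at confidence C satisfies R^n = 1 - C. The sketch below applies it in both directions; the function names are illustrative.

```python
import math


def demonstrated_reliability(sample_size, confidence):
    """Reliability shown by sample_size units with zero failures."""
    return (1.0 - confidence) ** (1.0 / sample_size)


def required_sample_size(reliability, confidence):
    """Smallest zero-failure sample size that demonstrates the reliability."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


n_needed = required_sample_size(reliability=0.90, confidence=0.90)
print(f"Units needed for R=0.90 at 90% confidence: {n_needed}")
print(f"Reliability shown by 10 units, zero failures, 90% confidence: "
      f"{demonstrated_reliability(10, 0.90):.3f}")
```

The result applies only to the test duration and conditions actually run, and says nothing about the shape of the failure distribution, which is why tests that produce failures are more informative.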
Highly Accelerated Life Testing (HALT)
HALT is a development testing method that subjects products to progressively increasing stress to find design weaknesses. Unlike ALT, HALT is not intended for life prediction but rather for finding and fixing problems before production. HALT typically combines thermal cycling, vibration, voltage stress, and other factors to rapidly find failure modes.
HALT testing progresses through stages: cold step stress (progressively lower temperatures), hot step stress (progressively higher temperatures), rapid thermal transitions, vibration step stress, and combined environment. Operating limits and destruct limits are identified, guiding design improvements.
Life Data Analysis
Accelerated test data must be properly analyzed to estimate life under use conditions. Weibull analysis is commonly used, with parameters estimated from test data and then adjusted for the use condition using the appropriate acceleration model. Confidence bounds account for statistical uncertainty in the estimates.
Key considerations include censored data (samples that have not failed by test end), competing failure modes (different mechanisms with different acceleration factors), and degradation data (measuring parameter drift rather than waiting for functional failure).
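As a simple illustration of Weibull parameter estimation, the sketch below fits the shape (beta) and scale (eta) parameters by median rank regression, using Bernard's approximation for the median ranks. It handles complete data only; censored samples require adjusted ranks or maximum likelihood estimation, which are not shown. The failure times in the example are invented.

```python
import math


def weibull_median_rank_fit(failure_times):
    """Estimate Weibull (beta, eta) from complete failure data.

    Linearizes the CDF as ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta)
    and fits a least-squares line through the median-rank points.
    """
    times = sorted(failure_times)
    n = len(times)
    if n < 2:
        raise ValueError("need at least two failure times")
    xs, ys = [], []
    for i, t in enumerate(times, start=1):
        median_rank = (i - 0.3) / (n + 0.4)  # Bernard's approximation
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - median_rank)))
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = y_mean - beta * x_mean
    eta = math.exp(-intercept / beta)
    return beta, eta


# Invented failure times in test hours.
hours = [410.0, 620.0, 790.0, 950.0, 1130.0, 1380.0]
beta, eta = weibull_median_rank_fit(hours)
print(f"beta = {beta:.2f}, eta = {eta:.0f} h")
```

A shape parameter above 1 indicates wear-out behavior. To project to use conditions, the scale parameter is typically multiplied by the acceleration factor while the shape parameter is assumed unchanged, an assumption that holds only if the failure mechanism is the same at both conditions.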
Practical Implications
Design for Reliability
Understanding degradation mechanisms enables proactive reliability design:
- Derating - Operating components below their rated limits extends life by reducing stress on degradation mechanisms
- Material selection - Choosing materials with appropriate properties for the application environment
- Thermal management - Reducing junction temperatures slows most degradation mechanisms
- Environmental protection - Conformal coatings, hermetic sealing, and controlled environments reduce corrosion and moisture effects
- Design margins - Accounting for parameter drift over life ensures end-of-life performance meets requirements
Failure Analysis Application
Knowledge of degradation mechanisms is essential for effective failure analysis. Recognizing the physical signatures of different mechanisms helps identify root causes:
- Electromigration produces voids and hillocks in conductors
- Thermal cycling fatigue creates characteristic crack patterns in solder joints
- Corrosion leaves distinctive surface deposits and dendrite patterns
- Whiskers have characteristic crystalline structure
- Dielectric breakdown creates pinholes or carbonized paths
Correlating observed failure signatures with environmental and operational history helps confirm failure mechanisms and guide corrective actions.
Reliability Prediction
Physics-of-failure approaches to reliability prediction use degradation models to estimate life under specific use conditions. This contrasts with older empirical methods that used historical failure rate data without considering the underlying physics. Physics-of-failure methods provide better accuracy when use conditions differ from historical experience and enable design optimization for reliability.
Summary
Electronic component degradation results from multiple mechanisms acting simultaneously. Electromigration moves metal atoms in conductors carrying high current density. Dielectric breakdown gradually weakens insulators under electrical stress. Whiskers grow from stressed metal surfaces, potentially causing short circuits. Corrosion degrades materials through electrochemical reactions with the environment. Thermal cycling fatigues interconnections through repeated expansion and contraction. Mechanical stress causes fatigue failures in components and their attachments. Radiation creates both immediate and cumulative damage in semiconductor devices.
Understanding these mechanisms enables engineers to design reliable electronics, predict service life, and effectively analyze failures when they occur. Accelerated life testing provides the means to assess reliability within practical timeframes, provided that appropriate acceleration models are used and tests are properly designed to preserve relevant failure mechanisms. By applying this knowledge, engineers can deliver electronic products that meet their reliability requirements throughout their intended service life.
Related Topics
- Environmental Ratings - Standards and specifications for operating conditions
- Protection Devices - Components that safeguard against damage
- Thermal Management Components - Solutions for controlling component temperature
- Fundamental Materials - Properties of conductive, insulating, and magnetic materials