MEMS Acoustic Devices

Introduction

MEMS acoustic devices are micro-electro-mechanical transducers that convert between sound and electrical signals on a silicon chip. They include the microphones that have become nearly universal in smartphones, laptops, and smart speakers; the emerging class of MEMS loudspeakers built into earbuds and hearing aids; and the ultrasonic transducers used for fingerprint sensing, proximity detection, gesture recognition, and medical imaging. By borrowing the batch fabrication of the semiconductor industry, these devices achieve small size, low cost, excellent unit-to-unit consistency, and easy integration with the signal-processing electronics that accompany them.

This article examines the principal families of MEMS acoustic devices, the transduction mechanisms on which they rely, the fabrication and packaging steps that turn a silicon wafer into a working acoustic component, and the metrics by which their performance is judged. The treatment assumes familiarity with the general principles of micro-electro-mechanical systems and focuses on the features specific to acoustic transduction.

Principles of MEMS Acoustic Transduction

Every acoustic transducer couples a moving mechanical element to the surrounding air or tissue and to an electrical port. In a microphone the sound field drives the mechanical element and the electrical port reports its motion; in a loudspeaker or an ultrasonic emitter the electrical port drives the element and the element radiates sound. The same physical principles run in both directions.

Mechanical Element and Acoustic Coupling

The mechanical element of a MEMS acoustic device is usually a thin diaphragm or membrane, only a few micrometers thick and a fraction of a millimeter across, suspended so that it can deflect under pressure. The compliance of the diaphragm, the mass of the moving structure, and the damping introduced by air moving through narrow gaps together set the frequency response and the resonance of the device. A back cavity behind the diaphragm and a small vent that equalizes static pressure shape the low-frequency behavior, while the diaphragm resonance sets the upper limit of a flat response.
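
As a rough illustration of how compliance and moving mass set the resonance, the diaphragm can be treated as a single lumped mass on a spring, ignoring damping and the air load. The sketch below uses that simplification. The function name and the numerical values are hypothetical, chosen only to give a plausible order of magnitude and not taken from any particular device.

```python
import math

def resonance_hz(compliance_m_per_n: float, mass_kg: float) -> float:
    """Undamped resonance of a lumped mass on a spring: f0 = 1 / (2*pi*sqrt(M*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(mass_kg * compliance_m_per_n))

# Hypothetical values, assumed for illustration only.
compliance = 1.0e-2  # m/N (a stiffness of 100 N/m)
mass = 1.0e-9        # kg (one microgram of effective moving mass)

f0 = resonance_hz(compliance, mass)
print(f"Estimated resonance: {f0 / 1e3:.1f} kHz")
```

With these assumed values the estimate lands in the tens of kilohertz, above the audio band, which is consistent with the resonance marking the upper end of the flat response. Air loading and damping, which this model ignores, lower and broaden the actual peak.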

Transduction Mechanisms

Two transduction mechanisms dominate MEMS acoustic devices. Capacitive, or electrostatic, transduction places the diaphragm close to a fixed electrode so that the two form a capacitor whose value changes as the diaphragm moves; a bias charge converts that capacitance change into a voltage, or an applied voltage produces an electrostatic force that moves the diaphragm. Piezoelectric transduction builds the diaphragm from, or coats it with, a piezoelectric thin film that develops a charge when it is strained and that strains when a voltage is applied. Each mechanism has characteristic strengths in sensitivity, bias requirements, reliability, and ease of fabrication, explored in the sections that follow.

Capacitive MEMS Microphones

The capacitive MEMS microphone is by far the most common MEMS acoustic device and one of the highest-volume MEMS products of any kind. It has largely displaced the electret condenser microphone in portable electronics because of its small size, surface-mount compatibility, resistance to reflow soldering, and consistency across large production runs.

Structure and Operation

A capacitive MEMS microphone consists of a flexible diaphragm suspended a small distance from a rigid, perforated backplate, the two forming a parallel-plate capacitor. Sound pressure deflects the diaphragm, changing the gap and therefore the capacitance. The perforations in the backplate let air pass through so that the backplate does not impede the diaphragm, while a sealed or vented back volume sets the acoustic stiffness. The transducer is paired with an application-specific integrated circuit that supplies the bias and buffers or digitizes the signal.

Biasing and the Readout Circuit

Because the readout depends on a fixed charge on the capacitor, the integrated circuit uses an on-chip charge pump to generate a high bias voltage, typically on the order of ten to fifteen volts, from the low supply available in portable equipment. The diaphragm capacitance is small and its source impedance very high, so the first amplifier presents an extremely high input impedance to avoid loading the signal. The companion circuit may provide an analog output or, more commonly in digital microphones, a digital output in the form of a pulse-density-modulated (PDM) bitstream or a serial audio interface such as I2S, either of which connects directly to a processor.
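
The fixed-charge readout and the loading problem can be put into numbers with an idealized parallel-plate model. The sketch below assumes a uniform, piston-like deflection, ignores fringing fields and backplate perforations, and treats the amplifier input as a simple parasitic capacitance. All function names and values are hypothetical and serve only to show the relationships.

```python
import math

EPS0 = 8.854e-12  # F/m, permittivity of free space

def plate_capacitance(area_m2: float, gap_m: float) -> float:
    """Ideal parallel-plate capacitance, ignoring fringing and backplate holes."""
    return EPS0 * area_m2 / gap_m

def open_circuit_signal(v_bias: float, gap_m: float, deflection_m: float) -> float:
    """With the charge held fixed, V = Q*g/(eps0*A), so dV = V_bias * dg / g0."""
    return v_bias * deflection_m / gap_m

def loaded_signal(v_signal: float, c_mic: float, c_parasitic: float) -> float:
    """Capacitive divider between the sensor and the amplifier input capacitance."""
    return v_signal * c_mic / (c_mic + c_parasitic)

# Hypothetical values, assumed for illustration only.
radius = 0.4e-3      # m, diaphragm radius
gap = 3.0e-6         # m, rest gap between diaphragm and backplate
v_bias = 12.0        # V, charge-pump output
deflection = 1.0e-9  # m, average diaphragm deflection
c_par = 0.5e-12      # F, parasitic capacitance at the amplifier input

c_mic = plate_capacitance(math.pi * radius**2, gap)
v_open = open_circuit_signal(v_bias, gap, deflection)
v_in = loaded_signal(v_open, c_mic, c_par)

print(f"Sensor capacitance:  {c_mic * 1e12:.2f} pF")
print(f"Open-circuit signal: {v_open * 1e3:.2f} mV")
print(f"Signal at amplifier: {v_in * 1e3:.2f} mV")
```

Because the sensor capacitance in this sketch is only on the order of a picofarad, even a fraction of a picofarad of parasitic capacitance removes a noticeable share of the signal, which is why the first stage must present such a high input impedance and low input capacitance.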

Strengths and Limitations

Capacitive MEMS microphones offer high sensitivity, a flat and well-controlled frequency response, and a low noise floor, and they tolerate the high temperatures of automated assembly. Their limitations include the need for a charge pump and a high-impedance front end, susceptibility of the narrow air gap to contamination by dust and particles, and a maximum sound level set by the point at which the diaphragm motion becomes nonlinear or the diaphragm contacts the backplate. Careful port and gap design mitigates the contamination and overload limits, while the charge pump and high-impedance front end are handled within the companion integrated circuit.

Piezoelectric MEMS Microphones

Piezoelectric MEMS microphones use a strained piezoelectric film rather than a variable capacitor to sense sound. They occupy a growing niche where their particular advantages outweigh their generally higher self-noise.

Structure and Operation

A piezoelectric microphone replaces the diaphragm-and-backplate capacitor with a cantilever or membrane that incorporates a thin film of a piezoelectric material such as aluminum nitride, scandium-doped aluminum nitride, or a lead zirconate titanate ceramic. When sound bends the structure, the film develops a charge proportional to the strain, which the readout circuit converts to a voltage. Because there is no narrow air gap and no need for a high bias voltage, the device is simpler in some respects than its capacitive counterpart.

Advantages and Trade-offs

The absence of a small air gap makes piezoelectric microphones inherently resistant to dust, water, and particle contamination, an advantage in rugged and waterproof products. They require no bias charge pump, simplifying the integrated circuit and reducing certain noise contributions, and they can offer a very high acoustic overload point, capturing loud sounds without distortion. The principal trade-off is self-noise: piezoelectric microphones have historically exhibited a higher noise floor, and therefore a lower signal-to-noise ratio, than the best capacitive devices, although advances in piezoelectric thin films continue to narrow the gap. Their robustness makes them attractive for waterproof phones, bone-conduction sensing, and harsh environments.

MEMS Speakers

MEMS loudspeakers, sometimes called micro-speakers when realized in silicon, apply the same micromachining techniques to the generation of sound rather than its detection. Once confined to research, they have entered the market in in-ear and hearing-assistance products, where their size and precision are decisive.

Actuation Approaches

A MEMS speaker must move enough air to produce audible sound pressure, which is demanding at the small dimensions of a chip, especially at low frequencies where the required volume displacement is large. Designers pursue several actuation principles. Piezoelectric MEMS speakers flex a membrane with a piezoelectric film and are the most commercially mature, offering low power consumption and a thin profile. Electrostatic designs drive the membrane with electrostatic force, while electrodynamic and other approaches remain largely experimental. Achieving adequate low-frequency output typically requires either a relatively large membrane area, an array of cells working together, or a sealed acoustic volume such as the closed cavity of an in-ear earbud.

Applications and Status

The natural home of the MEMS speaker is the sealed acoustic environment of an in-ear earphone or a hearing aid, where the small enclosed volume raises the sound pressure for a given displacement and where size and integration matter most. In these products MEMS speakers offer fast, well-controlled transient response, excellent channel matching, and compatibility with semiconductor assembly. For loudspeakers that must radiate into open air across the full audio band, conventional moving-coil drivers still dominate, and MEMS speakers complement rather than replace them.
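
The benefit of a sealed volume can be estimated with a simple adiabatic compression model. The sketch below assumes a rigid, leak-free cavity much smaller than the acoustic wavelength, so that the pressure change is gamma times the static pressure times the fractional volume change. The function name and the cavity and displacement values are hypothetical and do not describe a specific product.

```python
import math

P_ATM = 101_325.0  # Pa, static atmospheric pressure
GAMMA = 1.4        # adiabatic index of air
P_REF = 20e-6      # Pa, reference pressure for dB SPL

def sealed_cavity_spl(volume_displacement_m3: float, cavity_volume_m3: float) -> float:
    """SPL for a sinusoidal peak volume displacement in a small sealed cavity.

    Uses the small-signal adiabatic relation dp = gamma * P0 * dV / V0.
    """
    p_peak = GAMMA * P_ATM * volume_displacement_m3 / cavity_volume_m3
    p_rms = p_peak / math.sqrt(2.0)
    return 20.0 * math.log10(p_rms / P_REF)

# Hypothetical values, assumed for illustration only.
cavity = 1.0e-6         # m^3 (1 cubic centimeter of enclosed air)
displacement = 1.0e-11  # m^3 (0.01 cubic millimeter peak volume displacement)

spl = sealed_cavity_spl(displacement, cavity)
print(f"Sealed-cavity level: {spl:.1f} dB SPL")
```

In this model the pressure does not depend on frequency, and halving the cavity volume raises the level by about 6 dB. That is the sense in which a small enclosed volume helps a membrane with limited displacement, whereas radiating into open air at low frequencies demands far more volume displacement.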

Ultrasonic MEMS Transducers

Beyond the audible band, micromachined ultrasonic transducers generate and detect sound at frequencies from tens of kilohertz to tens of megahertz. They serve fingerprint sensing, gesture and proximity detection, range finding, flow measurement, and medical imaging. Two families dominate, distinguished by their transduction mechanism.

Piezoelectric Micromachined Ultrasonic Transducers

A piezoelectric micromachined ultrasonic transducer, abbreviated PMUT, uses a flexing membrane that carries a piezoelectric thin film. An applied voltage strains the film and bends the membrane to emit an ultrasonic pulse, and a returning echo bends the membrane to generate a charge that the electronics detect. PMUTs operate at relatively low voltages, integrate readily with electronics, and can be arranged in large arrays. They underlie ultrasonic fingerprint sensors built beneath display glass, in which an array images the ridges and valleys of a fingertip by timing reflected pulses, as well as gesture and presence sensing.

Capacitive Micromachined Ultrasonic Transducers

A capacitive micromachined ultrasonic transducer, abbreviated CMUT, is an electrostatic device: a thin membrane suspended over a vacuum or sealed gap forms a capacitor, and a bias voltage combined with an alternating drive vibrates the membrane to emit ultrasound, while incoming ultrasound modulates the capacitance for detection. CMUTs achieve wide bandwidth and good acoustic coupling to tissue and water, which suits them to medical ultrasound imaging and to fully integrated array probes. They require a relatively high bias voltage and careful control of the membrane gap. Both PMUT and CMUT arrays can be co-fabricated or closely integrated with the beam-forming and signal-processing electronics they require.
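
One way to see why a CMUT combines a bias with an alternating drive is to expand the electrostatic force, which is proportional to the square of the applied voltage. The sketch below assumes an ideal parallel-plate cell whose gap stays fixed (small membrane motion) and splits the force for a drive of the form V = v_dc + v_ac*cos(wt) into its static, fundamental, and second-harmonic parts. The function name and cell dimensions are hypothetical.

```python
import math

EPS0 = 8.854e-12  # F/m, permittivity of free space

def electrostatic_force_components(area_m2: float, gap_m: float,
                                   v_dc: float, v_ac: float):
    """Split F = eps0*A*V^2 / (2*g^2), with V = v_dc + v_ac*cos(wt), into
    (static, fundamental amplitude, second-harmonic amplitude) in newtons."""
    k = EPS0 * area_m2 / (2.0 * gap_m**2)
    static = k * (v_dc**2 + v_ac**2 / 2.0)
    fundamental = k * 2.0 * v_dc * v_ac  # term at the drive frequency
    second_harmonic = k * v_ac**2 / 2.0  # term at twice the drive frequency
    return static, fundamental, second_harmonic

# Hypothetical values, assumed for illustration only.
area = math.pi * (20e-6) ** 2  # m^2, cell of 20 micrometer radius
gap = 0.2e-6                   # m
v_dc = 30.0                    # V, bias
v_ac = 5.0                     # V, drive amplitude

static, fundamental, second = electrostatic_force_components(area, gap, v_dc, v_ac)
print(f"Static force:     {static * 1e6:.1f} uN")
print(f"Fundamental:      {fundamental * 1e6:.1f} uN")
print(f"Second harmonic:  {second * 1e6:.2f} uN")
```

The useful term at the drive frequency scales with the product of bias and drive amplitude, so a larger bias increases the output and makes the second-harmonic term relatively smaller. This is a simplified view that leaves out the gap change as the membrane moves.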

Applications of Ultrasonic MEMS

Ultrasonic MEMS transducers reach an expanding set of uses. Under-display fingerprint sensors image fingertips acoustically. Time-of-flight and presence sensors measure distance and detect occupancy without optics. Gesture interfaces track hand motion. In medicine, micromachined arrays form the active element of compact and even handheld ultrasound probes, and they support photoacoustic and intravascular imaging. Industrial uses include flow metering and non-destructive evaluation.
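
The time-of-flight measurement reduces to a short calculation: the pulse travels to the target and back, so the distance is half the round-trip time multiplied by the speed of sound. The sketch below assumes propagation in air and uses a common linear approximation for the speed of sound near room temperature. The function names and the example echo time are hypothetical.

```python
def speed_of_sound_air(temp_c: float) -> float:
    """Approximate speed of sound in dry air near room temperature, in m/s."""
    return 331.3 + 0.606 * temp_c

def echo_distance_m(round_trip_s: float, temp_c: float = 20.0) -> float:
    """Distance to a reflector from the round-trip echo time (out and back)."""
    return speed_of_sound_air(temp_c) * round_trip_s / 2.0

# Hypothetical echo time, assumed for illustration only.
round_trip = 5.8e-3  # s
distance = echo_distance_m(round_trip)
print(f"Estimated range: {distance:.3f} m")
```

Because the speed of sound changes with temperature, a practical sensor would typically compensate for it; the default of 20 degrees Celsius here is only an assumption.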

Fabrication

MEMS acoustic devices are manufactured with the same lithographic, deposition, and etching processes that produce integrated circuits, adapted to create the suspended diaphragms, sealed cavities, and piezoelectric films that acoustic transduction requires.

Diaphragm and Cavity Formation

The thin diaphragm is typically formed in a structural film of polysilicon, silicon nitride, or single-crystal silicon deposited or defined on a wafer. A sacrificial layer beneath the diaphragm holds its shape during processing and is then removed, in a release step, to free the diaphragm so that it can move; for capacitive microphones the same release defines the air gap between diaphragm and backplate. The back cavity and acoustic ports are commonly etched through the wafer by deep reactive-ion etching, which produces the deep, steep-walled openings that connect the diaphragm to the sound field. Wafer bonding may join a structural wafer to a cap wafer to seal a reference cavity or to enclose the transducer.

Piezoelectric and Backplate Layers

Piezoelectric devices add a film of aluminum nitride, scandium-doped aluminum nitride, or a ceramic such as lead zirconate titanate, deposited with controlled crystal orientation because the piezoelectric response depends on it, together with the metal electrodes that contact it. Capacitive devices add a perforated rigid backplate, often stiffened against deflection, with etched acoustic holes that let air pass. Stress control is critical throughout: residual film stress can warp a thin diaphragm or shift its resonance, so deposition conditions and annealing are tuned to leave the diaphragm flat and predictably compliant.
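
The sensitivity of the resonance to residual stress can be illustrated with the textbook formula for an ideal circular membrane under uniform tension, in which the film thickness cancels and the fundamental frequency scales with the square root of stress over density. The sketch below assumes a tension-dominated diaphragm with negligible bending stiffness and no air loading, which real diaphragms only approximate. The function name, the density, and the stress values are hypothetical.

```python
import math

J0_FIRST_ZERO = 2.4048  # first zero of the Bessel function J0

def membrane_f0_hz(radius_m: float, stress_pa: float, density_kg_m3: float) -> float:
    """Fundamental frequency of an ideal clamped circular membrane in tension."""
    wave_speed = math.sqrt(stress_pa / density_kg_m3)
    return J0_FIRST_ZERO / (2.0 * math.pi * radius_m) * wave_speed

# Hypothetical values, assumed for illustration only.
radius = 0.5e-3   # m
density = 3100.0  # kg/m^3, assumed for a silicon nitride film

for stress_mpa in (10.0, 20.0, 40.0):
    f0 = membrane_f0_hz(radius, stress_mpa * 1e6, density)
    print(f"Residual stress {stress_mpa:4.0f} MPa -> f0 = {f0 / 1e3:.1f} kHz")
```

In this model a fourfold change in tensile stress doubles the resonance, which gives a sense of why deposition and annealing conditions are controlled so closely.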

Integration with Electronics

Because the transducer signals are small and high in impedance, the sensing or driving electronics must sit close to the transducer. Manufacturers either co-fabricate the circuit and transducer on one die or, more often, place a separately optimized application-specific integrated circuit beside the MEMS die within a single package. The latter approach lets each die use its best-suited process while keeping the electrical connection short to preserve signal integrity.

Packaging and Acoustic Ports

Packaging is not a mere enclosure for an acoustic MEMS device; it forms part of the acoustic system and strongly influences performance. The package defines the back volume, routes sound to the diaphragm through a port, and protects the delicate structure while permitting the very coupling to the environment that other MEMS try to exclude.

Top-Port and Bottom-Port Designs

A MEMS microphone package admits sound through a small acoustic port, an opening in the package whose position defines the product type. In a bottom-port design the sound hole passes through the package substrate and aligns with a hole in the host circuit board, so the diaphragm faces downward into the board; the package lid then encloses a sealed back volume that improves low-frequency performance. In a top-port design the sound enters through a hole in the lid, leaving the substrate solid; in the conventional arrangement the back volume is then the smaller cavity beneath the die. The choice affects the back-volume size, the frequency response, and the way the microphone integrates with the product enclosure.

The Back Volume and Acoustic Tuning

The sealed volume behind the diaphragm acts as an acoustic spring whose size influences the sensitivity and the low-frequency roll-off; a larger back volume generally improves low-frequency response. The acoustic port and any connecting channel form an acoustic mass that, with the cavity, can create a resonance used deliberately to extend or shape the response. Designers tune these elements, together with the diaphragm, to achieve the target frequency response, and they treat the package, the port, and the host-device opening as a single coupled acoustic system.
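
The resonance formed by a port and the cavity it feeds can be estimated with the classical Helmholtz resonator formula. The sketch below assumes a cylindrical port, a lossless air column, and an approximate end correction of 1.7 times the port radius for two flanged ends. Viscous losses, which are significant in ports this narrow, are ignored. The function name and the dimensions are hypothetical.

```python
import math

def helmholtz_hz(port_radius_m: float, port_length_m: float,
                 cavity_volume_m3: float, c: float = 343.0) -> float:
    """Helmholtz resonance: f = c/(2*pi) * sqrt(S / (V * L_eff))."""
    area = math.pi * port_radius_m**2
    # Approximate end correction for a port flanged at both ends (assumption).
    effective_length = port_length_m + 1.7 * port_radius_m
    return c / (2.0 * math.pi) * math.sqrt(area / (cavity_volume_m3 * effective_length))

# Hypothetical values, assumed for illustration only.
port_radius = 0.25e-3  # m
port_length = 1.0e-3   # m, including board and gasket thickness
cavity = 1.0e-9        # m^3 (1 cubic millimeter)

f_res = helmholtz_hz(port_radius, port_length, cavity)
print(f"Estimated port resonance: {f_res / 1e3:.1f} kHz")
```

A longer or narrower port, or a larger cavity, pulls the resonance down toward the audio band, which is one reason the host-device opening has to be considered together with the package.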

Protection and Environmental Sealing

Because acoustic ports must remain open to the air, the package must protect the diaphragm from particles, moisture, and light while still admitting sound. Fine mesh screens, hydrophobic membranes, and carefully placed ports keep contaminants out without unduly impeding sound. Ultrasonic transducers for medical use add acoustic matching and lens layers to couple energy efficiently into tissue. Electromagnetic shielding and resistance to mechanical shock and to the high temperatures of soldering complete the demands placed on the package.

Performance Metrics

The quality of a MEMS acoustic device is captured by a set of standard metrics that allow designs to be compared and matched to an application. The relevant metrics differ somewhat between microphones, speakers, and ultrasonic transducers.

Microphone Metrics

Sensitivity expresses the electrical output produced by a reference sound pressure, stated for analog microphones in decibels relative to one volt per pascal (dBV/Pa) and for digital microphones in decibels relative to full scale (dBFS). The signal-to-noise ratio, conventionally A-weighted and referenced to a sound pressure of 94 decibels SPL (one pascal) at one kilohertz, compares that reference output with the device's own noise floor and is a primary measure of how quiet a sound the microphone can resolve; values in the mid-sixties of decibels are common and the best devices reach the low seventies. The acoustic overload point is the sound pressure level at which distortion reaches a defined limit, marking the loudest sound the microphone can capture cleanly. Other metrics include the frequency response and its flatness, the total harmonic distortion, the power-supply rejection ratio of the integrated circuit, and the consistency of sensitivity and phase between units, which matters for the microphone arrays used in beamforming.
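
These metrics are related by simple decibel arithmetic. The sketch below converts an analog sensitivity in decibels relative to one volt per pascal into millivolts per pascal, derives the equivalent input noise from the signal-to-noise ratio and the 94 dB SPL reference, and takes the dynamic range as the span between that noise level and the acoustic overload point. The function names and the datasheet-style figures are hypothetical.

```python
def dbv_per_pa_to_mv_per_pa(sensitivity_dbv: float) -> float:
    """Convert sensitivity from dB re 1 V/Pa to millivolts per pascal."""
    return 1000.0 * 10.0 ** (sensitivity_dbv / 20.0)

def equivalent_input_noise_dbspl(snr_db: float, ref_spl_db: float = 94.0) -> float:
    """Noise floor expressed as an equivalent sound level (A-weighted if the SNR is)."""
    return ref_spl_db - snr_db

def dynamic_range_db(aop_dbspl: float, snr_db: float) -> float:
    """Span between the acoustic overload point and the equivalent input noise."""
    return aop_dbspl - equivalent_input_noise_dbspl(snr_db)

# Hypothetical datasheet-style values, assumed for illustration only.
sensitivity = -38.0  # dB re 1 V/Pa
snr = 65.0           # dB, A-weighted
aop = 130.0          # dB SPL

sens_mv = dbv_per_pa_to_mv_per_pa(sensitivity)
ein = equivalent_input_noise_dbspl(snr)
dr = dynamic_range_db(aop, snr)

print(f"Sensitivity:            {sens_mv:.2f} mV/Pa")
print(f"Equivalent input noise: {ein:.0f} dB SPL (A-weighted)")
print(f"Dynamic range:          {dr:.0f} dB")
```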

Speaker and Ultrasonic Metrics

A MEMS speaker is judged by the sound pressure level it produces, especially at low frequencies, by its frequency response and bandwidth, by its total harmonic distortion at a given output, and by its power consumption and efficiency. An ultrasonic transducer is characterized by its center frequency and bandwidth, by its transmit sensitivity and receive sensitivity, by the acoustic pressure it can generate, by its directivity, and, for imaging arrays, by the element count, pitch, and uniformity that determine spatial resolution. Across all of these devices, reliability under shock, temperature, and humidity, and stability of the metrics over the product lifetime, are part of the overall assessment.
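
For an ultrasonic array, center frequency, bandwidth, and element pitch are linked through the acoustic wavelength. The sketch below computes the wavelength in the propagation medium, the half-wavelength pitch that serves as a common rule of thumb for steered arrays, and the fractional bandwidth taken relative to the midpoint of the band edges. The function names, the assumed sound speed of 1540 m/s for soft tissue, and the band limits are illustrative assumptions.

```python
def wavelength_m(sound_speed_m_s: float, frequency_hz: float) -> float:
    """Acoustic wavelength in the propagation medium."""
    return sound_speed_m_s / frequency_hz

def half_wavelength_pitch_m(sound_speed_m_s: float, frequency_hz: float) -> float:
    """Rule-of-thumb element pitch for a steered array (half a wavelength)."""
    return wavelength_m(sound_speed_m_s, frequency_hz) / 2.0

def fractional_bandwidth(f_low_hz: float, f_high_hz: float) -> float:
    """Bandwidth divided by the center frequency, taken as the band midpoint."""
    center = (f_low_hz + f_high_hz) / 2.0
    return (f_high_hz - f_low_hz) / center

# Hypothetical values, assumed for illustration only.
c_tissue = 1540.0            # m/s, commonly assumed value for soft tissue
f_center = 5.0e6             # Hz
f_low, f_high = 3.0e6, 7.0e6 # Hz, assumed band edges

lam = wavelength_m(c_tissue, f_center)
pitch = half_wavelength_pitch_m(c_tissue, f_center)
fbw = fractional_bandwidth(f_low, f_high)

print(f"Wavelength:           {lam * 1e3:.3f} mm")
print(f"Half-wavelength pitch: {pitch * 1e3:.3f} mm")
print(f"Fractional bandwidth: {fbw * 100:.0f} %")
```

The sub-millimeter pitch that results at megahertz frequencies is one reason micromachining suits these arrays.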

Summary

MEMS acoustic devices bring the economy and precision of semiconductor manufacturing to the conversion between sound and electricity. Capacitive MEMS microphones, the highest-volume members of the family, sense sound by the change in a parallel-plate capacitance and dominate portable audio; piezoelectric MEMS microphones trade some signal-to-noise ratio for robustness against contamination and freedom from a bias charge pump. MEMS speakers, most maturely realized with piezoelectric actuation, are establishing themselves in the sealed acoustic spaces of earbuds and hearing aids.

Beyond the audible range, piezoelectric and capacitive micromachined ultrasonic transducers, the PMUT and the CMUT, generate and detect ultrasound for fingerprint sensing, gesture and proximity detection, and medical imaging. All of these devices share a common foundation: a micromachined diaphragm fabricated by deposition, lithography, and deep etching; a transduction mechanism that is electrostatic or piezoelectric; close integration with a dedicated readout or drive circuit; and a package whose acoustic ports and back volume form an inseparable part of the acoustic design. Their performance is expressed through sensitivity, signal-to-noise ratio, acoustic overload point, and related metrics, and their continued advance is widening the reach of acoustic sensing and emission across consumer, industrial, and medical electronics.