Communication Aid Development
Communication aid development encompasses the design and creation of electronic devices that support alternative communication for individuals who cannot rely on natural speech. These augmentative and alternative communication (AAC) devices range from simple single-message switches to sophisticated speech-generating devices with thousands of vocabulary items, eye-tracking control, and predictive text capabilities.
For the millions of people worldwide affected by conditions such as autism spectrum disorder, cerebral palsy, amyotrophic lateral sclerosis (ALS), stroke, and traumatic brain injury, communication aids can be transformative, providing a voice where none existed before. This article explores the development platforms, hardware components, software frameworks, and design considerations essential for creating effective communication aids.
Understanding AAC Technology
Augmentative and alternative communication technology supports individuals with complex communication needs through a variety of approaches, each suited to different abilities and contexts.
Categories of AAC Devices
AAC devices span a wide range of complexity and capability:
- Single-message devices: Simple switches that play a single pre-recorded message when activated, ideal for initial communication development or specific requests
- Sequential message devices: Record and play back multiple messages in sequence, suitable for storytelling, jokes, or multi-step communications
- Static display devices: Fixed overlays with symbols or words that trigger speech output when pressed, offering consistent visual layout
- Dynamic display devices: Touchscreen interfaces with multiple pages and navigation, providing access to extensive vocabularies
- Text-based devices: Keyboard-based systems for literate users, often incorporating word and phrase prediction
- Hybrid systems: Combine symbol-based and text-based communication to support users transitioning between approaches
Access Methods
How users interact with communication devices depends on their motor abilities:
- Direct selection: Touching or pointing directly to items using fingers, stylus, or head pointer
- Scanning: Items are highlighted sequentially, and the user activates a switch when the desired item is indicated
- Eye gaze: Eye-tracking systems detect where the user is looking and select items based on dwell time or blink
- Head tracking: Camera-based systems track head movement to control cursor position
- Switch arrays: Multiple switches provide directional control and selection
Key Development Considerations
Successful communication aid development requires attention to factors beyond typical electronics design:
- Reliability: Communication aids are essential for daily life; system crashes or hardware failures can leave users unable to communicate
- Latency: Response time affects communication flow; delays exceeding a few hundred milliseconds disrupt natural conversation
- Battery life: Devices must operate throughout the day without requiring recharging during active use
- Durability: Many users have limited motor control, so devices must withstand drops, bumps, and exposure to saliva or food
- Customization: Individual needs vary enormously; devices must support extensive personalization
- Integration: Communication aids often need to interface with environmental controls, computers, and social media
Speech Generation Systems
Speech generation, also known as speech synthesis or text-to-speech (TTS), converts text or symbols into spoken output, giving voice to AAC users.
Text-to-Speech Engines
Modern TTS engines use various approaches to generate natural-sounding speech:
- Concatenative synthesis: Assembles speech from recordings of actual human speech units; provides natural sound but requires large databases
- Formant synthesis: Generates speech using mathematical models of vocal tract acoustics; smaller footprint but less natural
- Neural TTS: Deep learning models generate highly natural speech; requires significant processing power or cloud connectivity
- Unit selection: Advanced concatenative approach selecting optimal units from large databases; used in high-quality commercial systems
Available TTS platforms for embedded development include:
- eSpeak NG: Open-source formant synthesizer with a small footprint, supporting over 100 languages; suitable for resource-constrained devices (see the sketch after this list)
- Flite: Carnegie Mellon's lightweight speech synthesizer derived from Festival; designed for embedded systems
- Pico TTS: SVOX synthesizer included in Android; high quality with moderate resource requirements
- Microsoft Speech Platform: Windows-based TTS with high-quality voices; requires Windows operating system
- Amazon Polly: Cloud-based neural TTS with extensive voice selection; requires internet connectivity
- Google Cloud TTS: Neural synthesis with WaveNet technology; cloud-dependent but extremely natural
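As an illustration of the command-line route that lightweight engines such as eSpeak NG offer, the following minimal Python sketch speaks a message by shelling out to the espeak-ng binary. It assumes the engine is installed on a Linux-based device such as a Raspberry Pi; the voice name, rate, and volume values are illustrative defaults, not recommendations.

```python
import subprocess

def speak(text, voice="en-us", rate_wpm=170, volume=100):
    """Speak a message through eSpeak NG's command-line interface.

    Assumes the espeak-ng binary is on PATH (e.g. installed via the
    distribution's package manager). Blocking call: returns when
    playback finishes.
    """
    subprocess.run(
        ["espeak-ng", "-v", voice, "-s", str(rate_wpm), "-a", str(volume), text],
        check=True,
    )

if __name__ == "__main__":
    speak("I would like a drink of water, please.")
```

The same wrapper pattern works for other command-line synthesizers; only the executable name and its flags change, which makes it easy to swap engines during prototyping.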
Voice Banking and Personalization
For individuals who may lose speech due to progressive conditions, voice banking preserves their personal voice for later use in AAC devices:
- ModelTalker: Free voice banking system from Nemours that creates personalized synthesized voices from recordings
- Acapela my-own-voice: Commercial voice banking service creating personal TTS voices
- VocaliD: Blends voice banking recordings with donor voices to create unique personalized voices
- CereProc CereVoice Me: Creates high-quality personal voices from approximately 50 sentences of recordings
Voice banking requires careful planning, as most services need several hours of recordings while the person can still speak clearly. Development platforms should support importing and using banked voices.
Audio Output Hardware
Speech quality depends significantly on audio output hardware:
- Amplifier ICs: Class D amplifiers such as the MAX98357A or TPA3116D2 provide efficient power delivery with good audio quality
- DAC selection: External DACs like the PCM5102 offer superior audio quality over built-in microcontroller DACs
- Speaker selection: Full-range speakers with adequate bass response improve speech intelligibility; consider directional characteristics for noisy environments
- Enclosure acoustics: Proper speaker mounting and acoustic design significantly affect perceived voice quality
- Volume control: Automatic gain control and easy volume adjustment accommodate varying environments
Symbol-Based Interfaces
Symbol-based communication enables individuals who cannot read or have cognitive disabilities to communicate using pictures, icons, and graphic symbols that represent words, phrases, or concepts.
Symbol Systems and Libraries
Several standardized symbol systems are widely used in AAC:
- Picture Communication Symbols (PCS): Developed by Mayer-Johnson, PCS is the most widely used symbol set with over 45,000 symbols; requires licensing for commercial use
- SymbolStix: Contemporary, stylized symbols with consistent design language; used in many commercial AAC apps
- Widgit Symbols: Symbol system designed for literacy support with clear, schematic designs
- ARASAAC: Free, open-source symbol library with over 12,000 symbols in multiple languages; created by the Government of Aragon, Spain
- Mulberry Symbols: Open-source symbol set with adult-oriented vocabulary often missing from other libraries
- Blissymbolics: Semantic-based symbol system where meaning is encoded in symbol components; supports symbol combination for novel meanings
- Minspeak/Unity: Icon-based system using semantic compaction where sequences of icons represent words; efficient for motor-impaired users
Symbol Display Design
Effective symbol interfaces require careful visual design:
- Grid layout: Consistent grid patterns support motor memory and visual scanning; common sizes range from 2x2 for beginners to 15x10 for advanced users
- Color coding: Fitzgerald Key and similar systems use colors to indicate parts of speech (nouns, verbs, adjectives), supporting grammatical learning
- Symbol size: Larger symbols support users with visual impairments or motor difficulties; dynamic sizing adjusts to user needs
- Contrast and backgrounds: High contrast between symbols and backgrounds improves visibility; customizable color schemes accommodate visual preferences
- Symbol borders: Distinct borders help users distinguish between adjacent symbols
Vocabulary Organization
How symbols are organized affects communication speed and learning:
- Taxonomic organization: Symbols grouped by category (foods, animals, actions); intuitive but requires navigation
- Pragmatic organization: Symbols arranged by communication function (greetings, requests, comments)
- Semantic-syntactic organization: Core vocabulary on home page with category pages for fringe vocabulary
- Alphabetic organization: Symbols arranged alphabetically for literate users
- Activity-based: Vocabulary organized around specific activities or contexts
Research supports the importance of consistent placement of high-frequency core vocabulary, enabling users to develop motor automaticity for commonly used words.
Implementation Approaches
Symbol interface implementation on embedded platforms:
- Image formats: PNG with transparency for symbols; consider image compression for memory-constrained systems
- Database storage: SQLite or similar embedded databases organize symbol metadata, associations, and user customizations (see the schema sketch after this list)
- Rendering: Hardware-accelerated graphics improve responsiveness; consider caching frequently used symbols
- Localization: Symbol text labels must support multiple languages; consider right-to-left layouts for appropriate languages
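To make the database approach concrete, here is a minimal sketch of a symbol vocabulary schema using Python's built-in sqlite3 module. The table and column names are hypothetical and far simpler than a production AAC data model, which would also track symbol licensing, localization, and per-user customization history.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS pages (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    rows     INTEGER NOT NULL,
    cols     INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS buttons (
    id        INTEGER PRIMARY KEY,
    page_id   INTEGER NOT NULL REFERENCES pages(id),
    row       INTEGER NOT NULL,
    col       INTEGER NOT NULL,
    label     TEXT NOT NULL,               -- text shown under the symbol
    speech    TEXT NOT NULL,               -- text sent to the TTS engine
    image     TEXT,                        -- path to the symbol PNG
    link_page INTEGER REFERENCES pages(id) -- optional navigation target
);
"""

def open_vocab(path="vocabulary.db"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

# Example: create a small "Drinks" page with one button.
conn = open_vocab()
cur = conn.execute("INSERT INTO pages (name, rows, cols) VALUES (?, ?, ?)",
                   ("Drinks", 3, 4))
conn.execute(
    "INSERT INTO buttons (page_id, row, col, label, speech, image) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    (cur.lastrowid, 0, 0, "water", "I want water", "symbols/water.png"),
)
conn.commit()
```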
Text Prediction Systems
Text prediction accelerates communication for literate users by anticipating words and phrases, reducing the number of keystrokes required.
Prediction Approaches
Various prediction technologies support AAC text entry:
- Word completion: Predicts the remainder of words based on initial letters typed
- Word prediction: Suggests likely next words based on context and user history
- Phrase prediction: Predicts complete phrases or sentences commonly used by the individual
- Semantic prediction: Uses understanding of topic and context to improve predictions
- Personal dictionary learning: Adapts to individual vocabulary patterns over time
Language Models
Prediction systems rely on statistical or neural language models:
- N-gram models: Predict based on preceding word sequences; memory-efficient and well-suited to embedded systems
- Neural language models: Recurrent or transformer-based models providing superior prediction but requiring more resources
- Personal corpus training: Models trained on individual's previous communications improve relevance
- Topic adaptation: Dynamically adjusting predictions based on detected conversation topic
Prediction Libraries and Frameworks
Available resources for implementing text prediction:
- Presage: Open-source intelligent predictive text platform with pluggable prediction modules
- OpenAdaptxt: Open-source predictive text framework for mobile platforms
- LanguageTool: Grammar and style checking that can enhance AAC text production
- Custom n-gram implementations: Lightweight prediction using probability tables derived from text corpora
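As a sketch of the lightweight n-gram option, the following bigram predictor counts word pairs in a small personal corpus and suggests likely next words. The corpus, class name, and ranking are illustrative; a production predictor would add smoothing, a frequency-weighted unigram fallback, and persistent storage of the learned counts.

```python
from collections import Counter, defaultdict

class BigramPredictor:
    """Suggest next words from bigram counts over a personal corpus."""

    def __init__(self):
        self.following = defaultdict(Counter)  # word -> Counter of next words

    def train(self, text):
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.following[prev][nxt] += 1

    def predict(self, prev_word, k=5):
        """Return up to k most likely next words after prev_word."""
        return [w for w, _ in self.following[prev_word.lower()].most_common(k)]

predictor = BigramPredictor()
predictor.train("i want to go home . i want a drink . i want to watch tv")
print(predictor.predict("want"))   # ['to', 'a']
```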
Prediction Interface Design
Effective prediction interfaces minimize cognitive load:
- Prediction display: Show 3-9 predictions; too many overwhelm users, too few limit utility
- Ranking: Most likely predictions should appear in consistent, easy-to-select positions
- Selection methods: Support both direct selection and numbered/labeled selection via switch scanning
- Learning feedback: Indicate when the system learns new words or patterns
- Disambiguation: Handle homographs and context-dependent word selection gracefully
Scanning Interfaces
Scanning enables individuals with significant motor impairments to access communication devices using one or more switches, making AAC accessible to users who cannot use direct selection.
Scanning Methods
Various scanning approaches accommodate different abilities:
- Automatic scanning: Items highlight sequentially at a set rate; user activates switch when desired item is highlighted
- Step scanning: Each switch activation advances to the next item; a second switch or dwell time selects
- Inverse scanning: User holds switch to advance; releasing selects the current item
- Row-column scanning: Rows highlight first, then columns within the selected row; reduces number of selections needed
- Group-item scanning: Items are grouped; scanning first identifies the group, then the item within
- Auditory scanning: Items are announced rather than visually highlighted; supports users with visual impairments
Scan Rate and Timing
Optimal scanning timing depends on individual user abilities:
- Scan rate: Time between automatic advances; typical range 0.5 to 3 seconds depending on user reaction time
- First item delay: Extended time on first item after scan initiation compensates for reaction time
- Acceptance time: How long the switch must be held to register selection; filters accidental activations
- Release time: Delay after switch release before next action; prevents double selections
- Auto-start delay: Time before scanning begins after entering a new page or group
All timing parameters should be adjustable through the device's settings interface.
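The sketch below illustrates how these parameters might drive a simple automatic scanning loop. The ScanTiming defaults and the switch_pressed() polling callback are hypothetical placeholders; on real hardware the callback would read a debounced GPIO input or a Bluetooth switch event, and the highlight would be drawn on screen rather than printed.

```python
import time
from dataclasses import dataclass

@dataclass
class ScanTiming:
    scan_rate_s: float = 1.2         # time each item stays highlighted
    first_item_delay_s: float = 0.8  # extra time on the first item
    acceptance_time_s: float = 0.1   # switch must be held this long to count
    release_time_s: float = 0.3      # ignore input briefly after a selection

def auto_scan(items, timing, switch_pressed):
    """Highlight items in turn; return the item highlighted when the switch fires.

    switch_pressed() is a hypothetical callback that polls the user's switch.
    """
    while True:
        for index, item in enumerate(items):
            print(f"highlight: {item}")
            dwell = timing.scan_rate_s
            if index == 0:
                dwell += timing.first_item_delay_s
            deadline = time.monotonic() + dwell
            while time.monotonic() < deadline:
                if switch_pressed():
                    time.sleep(timing.acceptance_time_s)
                    if switch_pressed():              # still held: accept
                        time.sleep(timing.release_time_s)
                        return item
                time.sleep(0.01)
```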
Switch Hardware
Switches come in many forms to accommodate different motor abilities:
- Button switches: Standard push buttons requiring controlled finger or hand movement
- Plate switches: Large, flat surfaces activated by pressing anywhere; suitable for gross motor movement
- Pillow switches: Soft, squeezable switches for users with limited strength
- Sip-and-puff: Pneumatic switches activated by breath; suitable for users with minimal voluntary movement
- Proximity switches: Activated by approaching rather than touching; useful when contact is difficult
- Muscle switches: EMG-based detection of muscle activity; enables access for users with minimal movement
- Eye blink switches: Optical or EMG detection of intentional eye blinks
Switch Interface Electronics
Connecting switches to AAC devices requires appropriate interface circuits:
- Standard switch jacks: 3.5 mm (1/8-inch) mono jacks are the industry standard; normally open contacts connect tip to sleeve
- Debouncing: Hardware or software debouncing prevents false triggers from contact bounce (see the sketch after this list)
- Wireless switches: Bluetooth or proprietary wireless switches eliminate cable management challenges
- Switch latching: Optional latching mode for users who cannot maintain continuous switch pressure
- Multi-switch support: Two-switch scanning requires distinguishing between switch 1 (advance) and switch 2 (select)
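As an example of the software side of debouncing, here is a minimal sketch for a normally open switch wired to a Raspberry Pi GPIO pin, assuming the RPi.GPIO package and the chip's internal pull-up resistor. The pin number and 20 ms debounce window are illustrative values, not recommendations.

```python
import time
import RPi.GPIO as GPIO   # assumes a Raspberry Pi with the RPi.GPIO package

SWITCH_PIN = 17           # hypothetical BCM pin; switch wired tip-to-sleeve on a 3.5 mm jack
DEBOUNCE_S = 0.02         # ignore contact bounce shorter than 20 ms

GPIO.setmode(GPIO.BCM)
GPIO.setup(SWITCH_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)  # closed switch pulls pin low

def wait_for_press():
    """Block until a debounced switch press (and release) is detected."""
    while True:
        if GPIO.input(SWITCH_PIN) == GPIO.LOW:        # possible press
            time.sleep(DEBOUNCE_S)
            if GPIO.input(SWITCH_PIN) == GPIO.LOW:    # still closed: genuine press
                while GPIO.input(SWITCH_PIN) == GPIO.LOW:
                    time.sleep(0.005)                 # wait for release
                return
        time.sleep(0.005)
```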
Partner-Assisted Scanning
Partner-assisted scanning is a low-tech approach in which a communication partner presents options verbally or visually and the AAC user signals when the desired option is reached.
Techniques and Approaches
Partner-assisted scanning can be implemented in several ways:
- Auditory scanning: Partner reads options aloud; user signals when target item is spoken
- Visual scanning: Partner points to items on a communication board; user signals at target
- Eye pointing: Partner interprets user's eye gaze direction to identify selections
- Yes/no questions: Partner asks binary questions to narrow down the message
- Written choice: Partner writes options for literate users to select visually
Training Support Tools
Electronics can support partner-assisted scanning training and practice:
- Training apps: Software that guides partners through proper scanning techniques
- Timing feedback: Devices that measure and display scan timing to help partners maintain consistent pace
- Video modeling: Recorded examples demonstrating effective partner-assisted scanning
- Progress tracking: Systems that log communication attempts and successes for therapy planning
Hybrid Approaches
Combining electronic and partner-assisted scanning:
- Electronic backup: High-tech device available when trained partners are not present
- Alphabet supplementation: User indicates first letter using high-tech device while partner provides word guessing
- Topic setting: Electronic device establishes conversation topic; partner-assisted scanning continues from there
Vocabulary Management
Effective vocabulary management is essential for communication aid success, ensuring users have access to words and phrases relevant to their lives.
Core and Fringe Vocabulary
AAC vocabulary is typically divided into two categories:
- Core vocabulary: Approximately 200-400 high-frequency words that comprise 80% of typical communication; includes pronouns, verbs, adjectives, and common nouns
- Fringe vocabulary: Lower-frequency but personally important words; includes names, places, specific interests, and context-specific vocabulary
Development platforms should support maintaining both core vocabulary that remains consistent and fringe vocabulary that is regularly updated.
Vocabulary Selection Considerations
Choosing appropriate vocabulary involves multiple factors:
- Age appropriateness: Vocabulary and symbols should match the user's developmental and chronological age
- Personal relevance: Include names, favorite activities, important places, and individual interests
- Social language: Greetings, jokes, slang, and expressions that enable social participation
- Academic vocabulary: School-age users need vocabulary supporting classroom participation
- Emotional vocabulary: Words expressing feelings support social-emotional development
- Grammar words: Function words enabling grammatically complete sentences
Vocabulary Editing Tools
Communication aid platforms require robust vocabulary editing capabilities:
- Symbol editing: Replace default symbols with personal photos or alternative representations
- Button programming: Associate text, speech output, and actions with vocabulary items
- Page creation: Create new vocabulary pages for specific contexts or activities
- Navigation links: Connect pages through buttons that jump to related vocabulary
- Import/export: Share vocabulary configurations between devices or users
- Backup and restore: Protect vocabulary customization from device loss or failure
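A minimal sketch of import/export, reusing the hypothetical SQLite schema from the symbol-interface section: one page and its buttons are written to a JSON file that can be copied to another device or kept as a backup, then recreated elsewhere. The field names are illustrative and not a standard interchange format.

```python
import json

def export_page(conn, page_id, path):
    """Write one vocabulary page and its buttons to a JSON file for sharing."""
    page = conn.execute("SELECT name, rows, cols FROM pages WHERE id=?",
                        (page_id,)).fetchone()
    buttons = conn.execute(
        "SELECT row, col, label, speech, image FROM buttons WHERE page_id=?",
        (page_id,)).fetchall()
    data = {
        "name": page[0], "rows": page[1], "cols": page[2],
        "buttons": [dict(zip(("row", "col", "label", "speech", "image"), b))
                    for b in buttons],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)

def import_page(conn, path):
    """Read a shared page file and recreate it as a new page on this device."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    cur = conn.execute("INSERT INTO pages (name, rows, cols) VALUES (?, ?, ?)",
                       (data["name"], data["rows"], data["cols"]))
    for b in data["buttons"]:
        conn.execute(
            "INSERT INTO buttons (page_id, row, col, label, speech, image) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (cur.lastrowid, b["row"], b["col"], b["label"], b["speech"], b["image"]),
        )
    conn.commit()
```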
Usage Logging and Analytics
Tracking vocabulary usage supports therapy and development:
- Frequency analysis: Identify which vocabulary items are used most and least often
- Navigation patterns: Understand how users move through vocabulary pages
- Communication rate: Measure words per minute to track progress
- Error analysis: Identify vocabulary items causing selection errors
- Goal tracking: Monitor progress toward therapy objectives
Privacy considerations are essential when logging communication content; many systems offer options to log metadata without capturing actual messages.
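One way to honor that constraint is to log only metadata. The sketch below records which button was pressed, on which page, and how many words the resulting utterance contained, while the message text itself is never written to disk; the schema and function names are hypothetical.

```python
import sqlite3
import time

def open_log(path="usage_log.db"):
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS events (
        ts REAL, page TEXT, button_label TEXT, word_count INTEGER)""")
    return conn

def log_selection(conn, page, button_label, message_word_count):
    """Record which button was used and how long the utterance was.

    Only metadata is stored; the composed message text never reaches
    the log, keeping the analytics privacy-preserving.
    """
    conn.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
                 (time.time(), page, button_label, message_word_count))
    conn.commit()

def most_used(conn, n=10):
    """Frequency analysis: the n most frequently selected buttons."""
    return conn.execute(
        "SELECT button_label, COUNT(*) AS uses FROM events "
        "GROUP BY button_label ORDER BY uses DESC LIMIT ?", (n,)).fetchall()
```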
Personalization Tools
Extensive personalization is essential for communication aids to meet individual needs effectively.
Visual Customization
Visual appearance affects usability and acceptance:
- Color schemes: Customizable colors for backgrounds, buttons, and highlighting accommodate visual preferences and needs
- Font selection: Support for various fonts, sizes, and styles; consider dyslexia-friendly fonts
- Symbol size: Adjustable symbol and button sizes from large (few items) to small (many items)
- High contrast modes: Support for users with low vision through enhanced contrast options
- Animation settings: Control over scanning highlights, transitions, and visual feedback
Voice Customization
Voice output should reflect user identity:
- Voice selection: Choose from multiple voices matching user's age, gender, and regional accent
- Rate adjustment: Speaking rate control from slow and deliberate to rapid
- Pitch modification: Adjust voice pitch to sound more natural for individual users
- Custom pronunciations: Dictionary entries for names and unusual words
- Voice banking integration: Support for imported personalized voices
Motor Access Customization
Access settings accommodate varied motor abilities:
- Touch parameters: Adjust touch sensitivity, hold time, and release requirements
- Scanning timing: All timing parameters adjustable to individual reaction speeds
- Switch configuration: Map physical switches to various functions
- Keyguards: Support for physical keyguard overlays that prevent accidental touches
- Dwell settings: Configure dwell time for eye gaze and head tracking selection
User Profiles
Support for multiple configurations enables device sharing and context switching:
- Multiple users: Separate profiles for shared devices in classrooms or clinics
- Activity profiles: Different configurations for home, school, work, or therapy
- Progressive settings: Simplified settings that expand as user skills develop
- Backup profiles: Restore points enabling recovery from problematic changes
Development Hardware Platforms
Several hardware platforms are well-suited to communication aid development.
Microcontroller Platforms
For simple communication aids and custom input devices:
- Arduino: Entry-level platform suitable for single-message devices, switch interfaces, and simple projects; extensive community support
- ESP32: Adds WiFi and Bluetooth capability for wireless switches and simple speech playback from audio files
- Teensy: Powerful audio capabilities make it suitable for speech playback; USB device support enables HID functionality
- Adafruit Feather: Compact boards with various connectivity options; good for wearable or small-form-factor devices
Single-Board Computers
For full-featured AAC devices with speech synthesis and complex interfaces:
- Raspberry Pi: Capable of running full AAC software including speech synthesis; extensive peripheral support; active accessibility community
- BeagleBone: Real-time capabilities useful for precise scanning timing and switch debouncing
- NVIDIA Jetson Nano: GPU acceleration enables on-device neural TTS and eye-tracking processing
Tablet Platforms
Commercial tablets serve as AAC device platforms:
- iPad: Dominant platform for AAC apps; excellent touchscreen, accessibility features, and app ecosystem
- Android tablets: Lower cost alternative with growing AAC app availability
- Windows tablets: Support for full Windows AAC software; often paired with eye-tracking systems
Custom hardware development can create specialized peripherals and mounting systems for tablet-based AAC.
Display Technology
Display selection significantly affects AAC device usability:
- Touchscreen displays: Capacitive touchscreens for finger use; resistive for stylus and indirect pointing devices
- Screen size: Balance portability against symbol visibility; common sizes range from 7 to 15 inches
- Resolution: Higher resolution enables more readable text and detailed symbols
- Brightness: Outdoor visibility requires high-brightness displays or anti-glare treatments
- Viewing angle: Wide viewing angles support various mounting positions
Software Frameworks and Resources
Various open-source and commercial resources support communication aid development.
Open-Source AAC Software
Free and open-source AAC platforms provide development starting points:
- CoughDrop: Open-source web-based AAC system with symbol support, scanning, and cloud synchronization
- OpenBoard: Open-source AAC application for Android with symbol-based communication
- AAC Launchpad: Open-source educational platform for AAC vocabulary development
- AsTeRICS: Assistive technology platform for creating custom input devices and accessibility solutions
Development Libraries
Libraries supporting AAC feature development:
- Qt: Cross-platform framework suitable for AAC interfaces with good accessibility support
- React Native: Cross-platform mobile development enabling AAC apps on iOS and Android
- Electron: Desktop application framework for cross-platform AAC software
- Pygame: Python gaming library useful for simple AAC interfaces with audio output
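To show how quickly a prototype interface can come together, here is a minimal Pygame sketch: a 3x3 grid of labeled buttons that speaks a word when a cell is clicked or touched. It assumes pygame and espeak-ng are installed; the vocabulary, colors, and window size are placeholders, and there is no scanning or page navigation.

```python
import subprocess
import pygame

LABELS = ["yes", "no", "more", "stop", "help", "drink", "eat", "rest", "play"]
GRID = 3                                   # 3 x 3 grid of large buttons

def speak(text):
    # Assumes espeak-ng is installed; any other TTS call could be swapped in here.
    subprocess.Popen(["espeak-ng", text])

pygame.init()
screen = pygame.display.set_mode((600, 600))
font = pygame.font.SysFont(None, 48)
cell = 600 // GRID

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.MOUSEBUTTONDOWN:
            col, row = event.pos[0] // cell, event.pos[1] // cell
            speak(LABELS[row * GRID + col])
    screen.fill((30, 30, 30))
    for i, label in enumerate(LABELS):
        r = pygame.Rect((i % GRID) * cell, (i // GRID) * cell, cell - 4, cell - 4)
        pygame.draw.rect(screen, (70, 110, 180), r, border_radius=12)
        text = font.render(label, True, (255, 255, 255))
        screen.blit(text, text.get_rect(center=r.center))
    pygame.display.flip()

pygame.quit()
```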
Accessibility APIs
Platform accessibility frameworks support AAC integration:
- iOS Switch Control: Built-in scanning support enabling AAC apps to work with switches
- Android Accessibility Services: Framework for alternative access to Android devices
- Windows UI Automation: Accessibility framework for Windows AAC applications
- AT-SPI: Linux accessibility framework for assistive technology integration
Eye Gaze Technology
Eye-tracking provides access for users with minimal voluntary movement, enabling communication through eye gaze alone.
Eye-Tracking Hardware
Eye-tracking systems use cameras and illumination to detect eye position:
- Tobii Dynavox: Market leader in AAC eye-tracking with integrated devices and standalone trackers
- Tobii Eye Tracker 5: Consumer gaming tracker adaptable for AAC research and development
- Irisbond: Eye-tracking systems designed specifically for AAC access
- EyeTech: Eye-tracking solutions for AAC and accessibility applications
- Open-source trackers: ITU Gaze Tracker and Opengazer provide starting points for custom development
Gaze Interaction Methods
Various techniques translate eye gaze into selections:
- Dwell selection: Looking at a target for a specified duration triggers selection; the most common method (see the sketch after this list)
- Blink selection: Intentional blink while looking at target confirms selection
- Switch-assisted: Eye gaze positions cursor; external switch confirms selection
- Animated targets: Targets animate to draw attention and confirm gaze detection
- Zoom interfaces: Gaze regions progressively zoom to enable precise selection from dense grids
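A minimal sketch of the dwell-selection logic, independent of any particular tracker: the caller feeds in whichever on-screen target currently lies under the gaze point, and a selection fires once the gaze has rested on the same target for the configured dwell time. The class name and one-second default are illustrative.

```python
import time

class DwellSelector:
    """Trigger a selection when gaze stays on one target for dwell_s seconds."""

    def __init__(self, dwell_s=1.0):
        self.dwell_s = dwell_s
        self.current = None
        self.entered_at = None

    def update(self, target):
        """Feed the target under the gaze point (or None if gaze is off-target).

        Returns the target once the dwell time has elapsed, otherwise None.
        """
        now = time.monotonic()
        if target != self.current:
            self.current, self.entered_at = target, now    # gaze moved: restart timer
            return None
        if target is not None and now - self.entered_at >= self.dwell_s:
            self.entered_at = float("inf")                  # suppress repeats until gaze moves away
            return target
        return None
```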
Development Considerations
Eye gaze AAC development involves specific challenges:
- Calibration: Eye-tracking requires calibration for each user; simplified calibration supports users with cognitive disabilities
- Accuracy: Typical accuracy of 0.5-1 degree limits minimum target size; targets much smaller than about 1 inch (2.5 cm) become difficult to select reliably (see the worked example below)
- Robustness: Systems must handle head movement, changing lighting, and glasses
- Fatigue: Extended eye gaze use causes fatigue; interface design should minimize gaze travel distance
- Feedback: Visual feedback showing detected gaze position helps users understand system behavior
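A quick worked example shows why sub-degree accuracy translates into roughly inch-sized targets. Treating the accuracy figure as an error radius at the screen and applying a 2x safety margin, the calculation is simple trigonometry; the 60 cm viewing distance below is an assumed typical value.

```python
import math

def min_target_size_cm(accuracy_deg, viewing_distance_cm, margin=2.0):
    """Rough minimum target size: error radius at the screen times a safety margin."""
    error_cm = viewing_distance_cm * math.tan(math.radians(accuracy_deg))
    return margin * error_cm

# At 60 cm viewing distance, 1 degree of error is about 1 cm on screen,
# so targets of roughly 2 cm (about an inch) or more remain reliably selectable.
print(round(min_target_size_cm(1.0, 60), 2))   # ~2.09
```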
Integration and Connectivity
Modern communication aids often need to integrate with other systems and technologies.
Environmental Control
AAC devices can provide control over home and environmental systems:
- Infrared control: Learning remote functionality to control TVs, lights, and appliances
- Smart home integration: Connectivity with Alexa, Google Home, Apple HomeKit, and similar platforms
- Bluetooth device control: Pairing with phones, tablets, and Bluetooth-enabled devices
- X10 and Z-Wave: Home automation protocol support for specialized accessibility systems
Computer Access
Communication aids can provide keyboard and mouse functionality:
- USB HID: Devices appear as standard keyboards and mice to computers
- Bluetooth HID: Wireless keyboard and mouse functionality
- Text injection: Send composed messages to active applications
- Screen navigation: Mouse cursor control for full computer access
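A heavily simplified sketch of USB HID text injection on a Linux single-board computer that has already been configured as a USB keyboard gadget (so that /dev/hidg0 exists); the gadget setup itself is out of scope here. Key usage IDs follow the standard USB HID usage tables, and only lowercase letters and space are mapped for brevity.

```python
# Each HID report is 8 bytes: modifier, reserved, then up to six key usage IDs.
KEYCODES = {chr(ord('a') + i): 0x04 + i for i in range(26)}
KEYCODES[' '] = 0x2C

def type_text(text, device="/dev/hidg0"):
    """Send each character as a key-press report followed by a key-release report."""
    with open(device, "wb") as hid:
        for ch in text.lower():
            code = KEYCODES.get(ch)
            if code is None:
                continue                                    # unmapped characters are skipped
            hid.write(bytes([0, 0, code, 0, 0, 0, 0, 0]))   # press
            hid.write(bytes([0, 0, 0, 0, 0, 0, 0, 0]))      # release
            hid.flush()

type_text("hello from my communication aid")
```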
Social Media and Messaging
Direct integration with communication platforms:
- SMS and text messaging: Send and receive text messages from within AAC interface
- Email integration: Compose and read email messages
- Social media posting: Direct posting to Facebook, Twitter, and other platforms
- Video calling: Integration with Zoom, FaceTime, and similar platforms
Testing and Evaluation
Rigorous testing ensures communication aids meet user needs effectively and reliably.
Usability Testing
Evaluate device usability with representative users:
- Task completion: Measure success rates for specific communication tasks
- Communication rate: Words per minute achieved in realistic scenarios
- Error analysis: Document and analyze selection errors and recovery
- Learning curve: Track improvement over initial exposure period
- Fatigue effects: Assess performance degradation during extended use
Reliability Testing
Ensure devices meet robustness requirements:
- Drop testing: Survive drops from wheelchair height and table height
- Ingress protection: Resistance to saliva, food spills, and cleaning fluids
- Battery endurance: Operate throughout a full day of typical use
- Continuous operation: Extended reliability testing to identify intermittent failures
- Environmental testing: Operation across temperature and humidity ranges encountered in daily life
Accessibility Standards Compliance
Relevant standards for AAC device development:
- Section 508: US federal accessibility requirements for electronic equipment
- EN 301 549: European accessibility requirements for ICT products
- WCAG: Web Content Accessibility Guidelines for software interfaces
- ISO 9999: Classification of assistive products including communication aids
Funding and Distribution
Understanding the AAC market ecosystem informs development decisions.
Funding Sources
Communication aids are funded through various mechanisms:
- Insurance: Private health insurance coverage varies widely; documentation requirements are significant
- Medicaid: Covers AAC devices as durable medical equipment in the United States
- School systems: Required to provide AAC as educationally necessary assistive technology
- Vocational rehabilitation: Funds AAC for employment purposes
- Veterans Affairs: Provides AAC for qualifying veterans
- Private purchase: Lower-cost devices increasingly purchased directly by families
Regulatory Considerations
AAC devices face various regulatory requirements:
- FDA classification: In the US, speech-generating AAC devices are generally regulated as Class II medical devices (powered communication systems), most exempt from premarket notification
- CE marking: European medical device compliance for commercial distribution
- FCC compliance: Required for devices with wireless capabilities
- Documentation requirements: Thorough documentation supports funding approval
Summary
Communication aid development encompasses a broad range of technologies and techniques that enable individuals with complex communication needs to express themselves. From simple single-message switches to sophisticated eye-gaze-controlled speech generators, these devices transform lives by providing a voice to those who cannot speak.
Effective communication aid development requires understanding both the technical challenges of speech synthesis, symbol interfaces, scanning systems, and predictive text, and the human factors that determine whether a device will be adopted and used effectively. User-centered design, extensive customization options, and robust reliability are essential characteristics of successful AAC devices.
The field continues to evolve rapidly, with advances in neural text-to-speech, eye-tracking, and machine learning opening new possibilities for more natural, efficient communication. Open-source platforms and lower-cost hardware are democratizing access to AAC development, enabling makers, therapists, and researchers to create innovative solutions that complement commercial offerings.
For developers entering this field, the combination of technical challenge and profound human impact makes communication aid development uniquely rewarding, offering opportunities to make a meaningful difference in people's lives through thoughtful application of electronics and software engineering.