Validation of an automated relative fundamental frequency analysis for clinical voice evaluation
The pattern of change in voice fundamental frequency during intervocalic syllables can be a biomarker for dysregulated laryngeal muscle tension. Methods to identify this pattern—termed relative fundamental frequency (RFF)—require manual intervention, which poses a challenge for clinical translation. We show that acoustic-driven machine learning and signal processing techniques can be used to automate RFF estimation with preserved accuracy.
Gill, A., Raiff, L., Kirchgessner, E., Stepp, C.E., Kline, J.C., & Vojtech, J.M. “Validation of an automated relative fundamental frequency analysis for clinical voice evaluation,” The 15th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, Phoenix, AZ, USA, March 30–April 1, 2023.
Prediction of voice fundamental frequency and intensity from surface electromyographic signals of the face and neck
Silent speech interfaces (SSIs) enable speech recognition and synthesis in the absence of an acoustic signal. Yet, the archetypal SSI fails to convey the expressive attributes of prosody such as pitch and loudness, leading to lexical ambiguities. Here, we demonstrate feasibility for using surface electromyography as an approach for predicting continuous acoustic estimates of prosody.
Vojtech, J.M., Mitchell, C.L., Raiff, L., Kline, J.C., & De Luca, G. “Prediction of Voice Fundamental Frequency and Intensity from Surface Electromyographic Signals of the Face and Neck,” Vibration, 5(4), 692–710, 2023. doi: 10.3390/vibration5040041
Ability-based methods for personalized keyboard generation
Here, we introduce an ability-based method for personalized keyboard generation that uses an individual’s cursor control over time, distance, and direction to automatically compute a personalized virtual keyboard layout. This work underscores the importance of integrating a user’s motor abilities when designing virtual interfaces.
Mitchell, C.L., Cler, G.J., Fager, S.K., Contessa, P., Roy, S.H., De Luca, G., Kline, J.C., & Vojtech, J.M. “Ability-Based Methods for Personalized Keyboard Generation,” Multimodal Technologies and Interaction. 2022, 6, 67. doi: 10.3390/mti6080067
Ability-based keyboards for augmentative and alternative communication: Understanding how individuals’ movement patterns translate to more efficient keyboards
This study presents an evaluation of ability-based methods extended to keyboard generation for alternative communication in people with dexterity impairments due to motor disabilities. We highlight key observations relating to the heterogeneity of the manifestation of motor disabilities, perceived importance of communication technology, and quantitative improvements in communication performance when characterizing an individual's movement abilities to design personalized AAC interfaces.
Mitchell, C.L., Cler, G.J., Fager, S.K., Contessa, P., Roy, S.H., De Luca, G., Kline, J.C., & Vojtech, J.M. “Ability-based Keyboards for Augmentative and Alternative Communication: Understanding How Individuals’ Movement Patterns Translate to More Efficient Keyboards,” In CHI Conference on Human Factors in Computing Systems Extended Abstracts (CHI EA '22). Association for Computing Machinery, New York, NY, USA, Article 412, 1–7, 2022. doi: 10.1145/3491101.3519845
Effects of age and Parkinson's disease on the relationship between vocal fold abductory kinematics and relative fundamental frequency
This study reports on two experiments to examine vocal fold abduction and its relationship with relative fundamental frequency (RFF), considering two attributes that have been shown to elicit group differences in RFF: age and Parkinson's disease status. RFF is sensitive to changes in vocal fold abductory patterns during devoicing, irrespective of speaker age or Parkinson's disease status.
Vojtech, J.M., & Stepp, C.E. “Effects of Age and Parkinson’s Disease on the Relationship between Vocal Fold Abductory Kinematics and Relative Fundamental Frequency,” Journal of Voice, In Press. doi: 10.1016/j.jvoice.2022.03.007
Vojtech, J.M. & Stepp, C.E., “Effects of Age, Sex, and Parkinson's Disease on Kinematic and Acoustic Features of Phonatory Offset,” The 14th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, Bogotá, Colombia, June 7–10, 2021.
Acoustic identification of the voicing boundary during intervocalic offsets and onsets based on vocal fold vibratory measures
Methods for automating relative fundamental frequency (RFF)—an acoustic estimate of laryngeal tension—rely on manual identification of voiced/unvoiced boundaries from acoustic signals. Incorporating acoustic features that corresponded with voiced/unvoiced boundaries led to improvements in boundary detection accuracy that surpassed the gold-standard method for calculating RFF.
Vojtech, J.M., Cilento, D.D., Luong, A.T., Noordzij, Jr., J.P., Diaz-Cadiz, M.C., Groll, M.D., Buckley, D.P., McKenna, V.S., Noordzij, J.P., & Stepp, C.E. “Acoustic identification of the voicing boundary during intervocalic offsets and onsets based on vocal fold vibratory measures,” Applied Sciences, 11(9), 3816, 2021. doi: 10.3390/app11093816
Vojtech, J.M., Cilento, D.D., Luong, A.T., Noordzij, Jr., J.P., Diaz-Cadiz, M.C., Groll, M.D., Buckley, D.P., McKenna, V.S., Noordzij, J.P., & Stepp, C.E., “Acoustic Identification of the Voicing Boundary during Intervocalic Offsets and Onsets based on Vocal Fold Vibratory Measures,” The 14th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, Bogotá, Colombia, June 7–10, 2021.
Surface EMG-based recognition, synthesis and perception of prosodic subvocal speech
This study aimed to evaluate a novel communication system designed to translate surface electromyographic (sEMG) signals from articulatory muscles into speech using a personalized, digital voice. We establish the feasibility of using subvocal sEMG-based alternative communication not only for lexical recognition but also for prosodic communication in healthy individuals, as well as those living with vocal impairments and residual articulatory function.
Vojtech, J.M., Chan, M.D., Shiwani, B., Roy, S.H., Heaton, J.T., Meltzner, G.S., Contessa, P., De Luca, G., Patel, R., & Kline, J.C. “Surface EMG-based recognition, synthesis and perception of prosodic subvocal speech,” Journal of Speech, Language, and Hearing Research, 64(6S), 2134–53, 2021. doi: 10.1044/2021_JSLHR-20-00257
Integrated head-tilt & surface electromyographic cursor control for augmentative and alternative communication
This study evaluated two head control-based access methods that could be used for 2-D cursor control of an augmentative and alternative communication system: one based on surface electromyography and accelerometry, and one based on computer vision technology.
Vojtech, J.M., Hablani, S., Cler, G.J., & Stepp, C.E., “Integrated head-tilt & surface electromyographic cursor control for augmentative and alternative communication,” Conference on Motor Speech, Santa Barbara, CA, USA, February 19–23, 2020.
Integrated head-tilt and electromyographic cursor control
We evaluated the performance of two alternate computer access methods that could be used for two-dimensional cursor control. The first method, ACC/sEMG, integrates head acceleration and facial surface electromyography. The second method, Camera Mouse, is a free-to-use, computer vision-based access method. We show that our ACC/sEMG system is an effective computer access method across different lighting conditions and computer orientations, but that there is a tradeoff between speed (Camera Mouse) and accuracy (ACC/sEMG). Future development will focus on evaluating performance of each method in populations with limited motor abilities.
Vojtech, J.M., Hablani, S., Cler, G.J., & Stepp, C.E. “Integrated head-tilt and electromyographic cursor control,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, 28(6), 1442–51, 2020. doi: 10.1109/TNSRE.2020.2987144
Refining algorithmic estimation of relative fundamental frequency: Accounting for sample characteristics and fundamental frequency estimation method
Relative fundamental frequency (RFF) is a promising acoustic measure for evaluating voice disorders. Yet, the accuracy of the current RFF algorithm varies across a broad range of vocal signals. We show that refining fundamental frequency estimation and accounting for sample characteristics leads to increased correspondence with manual RFF.
Vojtech, J.M., Segina, R. K., Buckley, D. P., Kolin, K. R., Tardif, M. C., Noordzij, J. P., & Stepp, C. E. “Refining algorithmic estimation of relative fundamental frequency: Accounting for sample characteristics and fundamental frequency estimation method,” The Journal of the Acoustical Society of America, 146(5), 3184, 2019. doi: 10.1121/1.5131025
Adductory vocal fold kinematic trajectories during conventional versus high-speed videoendoscopy
Prephonatory vocal fold angle trajectories may supply useful information about the laryngeal system but were examined in previous studies using sigmoidal curves fit to data collected at 30 frames per second (fps). Here, high-speed videoendoscopy (HSV) was used to investigate the impacts of video frame rate and sigmoidal fitting strategy on vocal fold adductory patterns for voicing onsets. We find that vocal fold kinematic behavior during adduction is generally sigmoidal in adults with typical voices, although such fits can produce substantial errors when data are acquired at frame rates lower than 120 fps.
Díaz-Cádiz, M., McKenna, V.S., Vojtech, J.M., & Stepp, C.E. “Adductory vocal fold kinematic trajectories during conventional versus high-speed videoendoscopy,” Journal of Speech, Language, and Hearing Research, 62(6), 1687–1706, 2019. doi: 10.1044/2019_JSLHR-S-18-0405
The effects of modulating fundamental frequency and speech rate on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech
This study investigated how modulating fundamental frequency (f0) and speech rate differentially impact the naturalness, intelligibility, and communication efficiency of synthetic speech. We show evidence of a trade-off in naturalness and intelligibility of synthesized speech, which may impact future speech synthesis designs.
Vojtech, J.M., Noordzij Jr., J.P., Cler, G.J., & Stepp, C.E. “The effects of modulating fundamental frequency and speech rate on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech,” American Journal of Speech-Language Pathology, 28(2S), 875–86, 2019. doi: 10.1044/2019_AJSLP-MSC18-18-0052
Effects of prosody on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech in augmentative and alternative communication
Surface electromyography (sEMG) is a promising computer access method for individuals with motor impairments. However, optimal sensor placement is a tedious task requiring trial and error by an expert, particularly when recording from facial musculature likely to be spared in individuals with neurological impairments. Here, we demonstrate that non-experts can place sEMG sensors in the vicinity of usable muscle sites for computer access and that healthy individuals can learn to efficiently control a human–machine interface.
Vojtech, J.M., Noordzij Jr., J.P., Cler, G.J., & Stepp, C.E. “Effects of prosody on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech in augmentative and alternative communication,” Conference on Motor Speech, Savannah, GA, USA, February 22–25, 2018.
Predicting optimal surface electromyographic control of communication devices in individuals with motor speech disorders
We sought to improve the clinical applicability of using surface electromyography as a computer access method for individuals with motor speech disorders by reducing the complexity of sensor placement. We describe difficulties in predicting user performance, likely as a result of the diverse manifestation of motor speech disorders.
Vojtech, J.M., Cler, G.J., Fager, S., & Stepp, C.E. “Predicting optimal surface electromyographic control of communication devices in individuals with motor speech disorders,” Conference on Motor Speech, Savannah, GA, USA, February 22–25, 2018.
Prediction of optimal facial electromyographic sensor configurations for human–machine interface control
Surface electromyography is a promising computer access method for individuals with motor impairments, but sensor placement is tedious, requiring trial-and-error application and removal by an expert. We demonstrate that non-experts can place these electromyographic sensors in the vicinity of usable muscle sites for computer access and that healthy individuals can learn to efficiently control a human–machine interface.
Vojtech, J.M., Cler, G.J., & Stepp, C.E. “Prediction of optimal facial electromyographic sensor configurations for human-machine interface control,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(8), 1566–76, 2018. doi: 10.1109/TNSRE.2018.2849202
Evaluation of facial electromyographic sensor configuration for optimizing human-machine interface control
This work describes the identification of quantitative features to predict sensor configurations on the face for improved electromyographic cursor control.
Vojtech, J.M., Cler, G.J., Noordzij, Jr., J.P., & Stepp, C.E., “Evaluation of facial electromyographic sensor configuration for optimizing human-machine interface control,” Boston Speech Motor Control Mini Symposium, Boston University, Boston, MA, USA, March 31, 2017.