Vojtech_AQL_2023.pdf

Validation of an automated relative fundamental frequency analysis for clinical voice evaluation

The pattern of change in voice fundamental frequency during intervocalic syllables can be a biomarker for dysregulated laryngeal muscle tension. Methods to identify this pattern—termed relative fundamental frequency (RFF)—require manual intervention, which poses a challenge for clinical translation. We show that acoustic-driven machine learning and signal processing techniques can be used to automate RFF estimation with preserved accuracy.

Gill, A., Raiff, L., Kirchgessner, E., Stepp, C.E., Kline, J.C., & Vojtech, J.M. “Validation of an automated relative fundamental frequency analysis for clinical voice evaluation,” The 15th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research, Phoenix, AZ, USA, March 30–April 1, 2023.

Vojtech_VIB_2023.pdf

Prediction of voice fundamental frequency and intensity from surface electromyographic signals of the face and neck

Silent speech interfaces (SSIs) enable speech recognition and synthesis in the absence of an acoustic signal. Yet, the archetypal SSI fails to convey the expressive attributes of prosody such as pitch and loudness, leading to lexical ambiguities. Here, we demonstrate the feasibility of using surface electromyography to predict continuous acoustic estimates of prosody.

Vojtech, J.M., Mitchell, C.L., Raiff, L., Kline, J.C., & De Luca, G. “Prediction of Voice Fundamental Frequency and Intensity from Surface Electromyographic Signals of the Face and Neck,” Vibration, 5(4), 692–710, 2023. doi: 10.3390/vibration5040041

Vojtech_MTI_2022.pdf

Ability-based methods for personalized keyboard generation

Here, we introduce an ability-based method for personalized keyboard generation that uses an individual’s cursor control over time, distance, and direction to automatically compute a personalized virtual keyboard layout. This work underscores the importance of integrating a user’s motor abilities when designing virtual interfaces.

Mitchell, C.L., Cler, G.J., Fager, S.K., Contessa, P., Roy, S.H., De Luca, G., Kline, J.C., & Vojtech, J.M. “Ability-Based Methods for Personalized Keyboard Generation,” Multimodal Technologies and Interaction, 6(8), 67, 2022. doi: 10.3390/mti6080067

Vojtech_CHI_2022.pdf

Ability-based keyboards for augmentative and alternative communication: Understanding how individuals’ movement patterns translate to more efficient keyboards

This study presents an evaluation of ability-based methods extended to keyboard generation for alternative communication in people with dexterity impairments due to motor disabilities. We highlight key observations relating to the heterogeneity of the manifestation of motor disabilities, perceived importance of communication technology, and quantitative improvements in communication performance when characterizing an individual's movement abilities to design personalized AAC interfaces.

Mitchell, C.L., Cler, G.J., Fager, S.K., Contessa, P., Roy, S.H., De Luca, G., Kline, J.C., & Vojtech, J.M. “Ability-based Keyboards for Augmentative and Alternative Communication: Understanding How Individuals’ Movement Patterns Translate to More Efficient Keyboards,” In CHI Conference on Human Factors in Computing Systems Extended Abstracts (CHI EA '22). Association for Computing Machinery, New York, NY, USA, Article 412, 1–7, 2022. doi: 10.1145/3491101.3519845

Vojtech_JV_2021.pdf

Effects of age and Parkinson's disease on the relationship between vocal fold abductory kinematics and relative fundamental frequency

This study reports on two experiments to examine vocal fold abduction and its relationship with relative fundamental frequency (RFF), considering two attributes that have been shown to elicit group differences in RFF: age and Parkinson's disease status. RFF is sensitive to changes in vocal fold abductory patterns during devoicing, irrespective of speaker age or Parkinson's disease status.

Vojtech_AS_2021.pdf

Acoustic identification of the voicing boundary during intervocalic offsets and onsets based on vocal fold vibratory measures

Methods for automating relative fundamental frequency (RFF)—an acoustic estimate of laryngeal tension—rely on manual identification of voiced/unvoiced boundaries from acoustic signals. Incorporating acoustic features that corresponded with voiced/unvoiced boundaries led to improvements in boundary detection accuracy that surpassed the gold-standard method for calculating RFF.

Vojtech_JSLHR_2021.pdf

Surface EMG-based recognition, synthesis and perception of prosodic subvocal speech

This study aimed to evaluate a novel communication system designed to translate surface electromyographic (sEMG) signals from articulatory muscles into speech using a personalized, digital voice. We establish the feasibility of using subvocal sEMG-based alternative communication not only for lexical recognition but also for prosodic communication in healthy individuals, as well as those living with vocal impairments and residual articulatory function.

Vojtech, J.M., Chan, M.D., Shiwani, B., Roy, S.H., Heaton, J.T., Meltzner, G.S., Contessa, P., De Luca, G., Patel, R., & Kline, J.C. “Surface EMG-based recognition, synthesis and perception of prosodic subvocal speech,” Journal of Speech, Language, and Hearing Research, 64(6S), 2134–53, 2021. doi: 10.1044/2021_JSLHR-20-00257

Vojtech_CMS_2020_1.pdf

Integrated head-tilt & surface electromyographic cursor control for augmentative and alternative communication

This study evaluated two head control-based access methods that could be used for 2-D cursor control of an augmentative and alternative communication system: one based on surface electromyography and accelerometry, and one based on computer vision technology.

Vojtech, J.M., Hablani, S., Cler, G.J., & Stepp, C.E. “Integrated head-tilt & surface electromyographic cursor control for augmentative and alternative communication,” Conference on Motor Speech, Santa Barbara, CA, USA, February 19–23, 2020.

Vojtech_IEEE_2020.pdf

Integrated head-tilt and electromyographic cursor control

We evaluated the performance of two alternate computer access methods that could be used for two-dimensional cursor control. The first method, ACC/sEMG, integrates head acceleration and facial surface electromyography. The second method, Camera Mouse, is a free-to-use, computer vision-based access method. We show that our ACC/sEMG system is an effective computer access method across different lighting conditions and computer orientations, but that there is a tradeoff between speed (Camera Mouse) and accuracy (ACC/sEMG). Future development will focus on evaluating performance of each method in populations with limited motor abilities.

Vojtech, J.M., Hablani, S., Cler, G.J., & Stepp, C.E. “Integrated head-tilt and electromyographic cursor control,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, 28(6), 1442–51, 2020. doi: 10.1109/TNSRE.2020.2987144

Vojtech_JASA_2019.pdf

Refining algorithmic estimation of relative fundamental frequency: Accounting for sample characteristics and fundamental frequency estimation method

Relative fundamental frequency (RFF) is a promising acoustic measure for evaluating voice disorders. Yet, the accuracy of the current RFF algorithm varies across a broad range of vocal signals. We show that refining fundamental frequency estimation and accounting for sample characteristics leads to increased correspondence with manual RFF.

Vojtech, J.M., Segina, R.K., Buckley, D.P., Kolin, K.R., Tardif, M.C., Noordzij, J.P., & Stepp, C.E. “Refining algorithmic estimation of relative fundamental frequency: Accounting for sample characteristics and fundamental frequency estimation method,” The Journal of the Acoustical Society of America, 146(5), 3184, 2019. doi: 10.1121/1.5131025

Vojtech_JSLHR_2019.pdf

Adductory vocal fold kinematic trajectories during conventional versus high-speed videoendoscopy

Prephonatory vocal fold angle trajectories may supply useful information about the laryngeal system but were examined in previous studies using sigmoidal curves fit to data collected at 30 frames per second (fps). Here, high-speed videoendoscopy (HSV) was used to investigate the impacts of video frame rate and sigmoidal fitting strategy on vocal fold adductory patterns for voicing onsets. We find that vocal fold kinematic behavior during adduction is generally sigmoidal in adults with typical voices, although such fits can produce substantial errors when data are acquired at frame rates lower than 120 fps.

Díaz-Cádiz, M., McKenna, V.S., Vojtech, J.M., & Stepp, C.E. “Adductory vocal fold kinematic trajectories during conventional versus high-speed videoendoscopy,” Journal of Speech, Language, and Hearing Research, 62(6), 1687–1706, 2019. doi: 10.1044/2019_JSLHR-S-18-0405

Vojtech_AJSLP_2019.pdf

The effects of modulating fundamental frequency and speech rate on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech

This study investigated how modulating fundamental frequency (f0) and speech rate differentially impact the naturalness, intelligibility, and communication efficiency of synthetic speech. We show evidence of a trade-off in naturalness and intelligibility of synthesized speech, which may impact future speech synthesis designs.

Vojtech, J.M., Noordzij Jr., J.P., Cler, G.J., & Stepp, C.E. “The effects of modulating fundamental frequency and speech rate on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech,” American Journal of Speech-Language Pathology, 28(2S), 875–86, 2019. doi: 10.1044/2019_AJSLP-MSC18-18-0052

Vojtech_CMS_2018_2.pdf

Effects of prosody on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech in augmentative and alternative communication

Surface electromyography (sEMG) is a promising computer access method for individuals with motor impairments. However, optimal sensor placement is a tedious task requiring trial and error by an expert, particularly when recording from facial musculature likely to be spared in individuals with neurological impairments. Here, we demonstrate that non-experts can place sEMG sensors in the vicinity of usable muscle sites for computer access and that healthy individuals will learn to efficiently control a human-machine interface.

Vojtech, J.M., Noordzij Jr., J.P., Cler, G.J., & Stepp, C.E. “Effects of prosody on the intelligibility, communication efficiency, and perceived naturalness of synthetic speech in augmentative and alternative communication,” Conference on Motor Speech, Savannah, GA, USA, February 22–25, 2018.

Vojtech_CMS_2018_1.pdf

Predicting optimal surface electromyographic control of communication devices in individuals with motor speech disorders

We sought to improve the clinical applicability of using surface electromyography as a computer access method for individuals with motor speech disorders by reducing the complexity of sensor placement. We describe difficulties in predicting user performance, likely as a result of the diverse manifestation of motor speech disorders.

Vojtech, J.M., Cler, G.J., Fager, S., & Stepp, C.E. “Predicting optimal surface electromyographic control of communication devices in individuals with motor speech disorders,” Conference on Motor Speech, Savannah, GA, USA, February 22–25, 2018.

Vojtech_IEEE_2018.pdf

Prediction of optimal facial electromyographic sensor configurations for human–machine interface control

Surface electromyography is a promising computer access method for individuals with motor impairments, but sensor placement is tedious, requiring trial-and-error application and removal by an expert. We demonstrate that non-experts can place these electromyographic sensors in the vicinity of usable muscle sites for computer access and that healthy individuals will learn to efficiently control a human-machine interface.

Vojtech, J.M., Cler, G.J., & Stepp, C.E. “Prediction of optimal facial electromyographic sensor configurations for human-machine interface control,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(8), 1566–76, 2018. doi: 10.1109/TNSRE.2018.2849202

Vojtech_BSMCS_2018.pdf

Evaluation of facial electromyographic sensor configuration for optimizing human-machine interface control

This work describes the identification of quantitative features to predict sensor configurations on the face for improved electromyographic cursor control. 

Vojtech, J.M., Cler, G.J., Noordzij Jr., J.P., & Stepp, C.E. “Evaluation of facial electromyographic sensor configuration for optimizing human-machine interface control,” Boston Speech Motor Control Mini Symposium, Boston University, Boston, MA, USA, March 31, 2017.