Inohara, Ken, Sumita, Yuka I., Ohbayashi, Naoto, Ino, Shuichi, Kurabayashi, Tohru, Ifukube, Tohru, and Taniguchi, Hisashi
Journal of Voice. Jul2010, Vol. 24 Issue 4, p503-509. 7p.
Vocal tract, Voice disorders, Tomography, Rapid prototyping, and Head & neck cancer patients
Summary: Postoperative head and neck cancer patients suffer from speech disorders, which are the result of changes in their vocal tracts. Making a solid vocal tract model and measuring its transmission characteristics will provide one of the most useful tools to resolve the problem. In binary conversion of X-ray computed tomographic (CT) images for vocal tract reconstruction, nonobjective methods have been used by many researchers. We hypothesized that a standardized vocal tract model could be reconstructed by adopting the Hounsfield number of fat tissue as a criterion for thresholding of binary conversion, because its Hounsfield number is the nearest to air in the human body. The purpose of this study was to establish a new standardized method for binary conversion in reconstructing three-dimensional (3-D) vocal tract models. CT images for postoperative diagnosis were secondarily obtained from a CT scanner. Each patient's minimum settings of Hounsfield number for the buccal fat-pad regions were measured. Thresholds were set every 50 Hounsfield units (HU) from the bottom line of the buccal fat-pad region to −1024 HU; the images were converted into binary values and evaluated according to a three-grade system based on anatomically defined criteria. The optimal threshold between tissue and air was determined by nonlinear multiple regression analyses. Each patient's minimum settings of the buccal fat-pad regions were obtained. The optimal threshold was determined to be −165 HU from each patient's minimum settings of the Hounsfield number for the buccal fat-pad regions. To conclude, a method of 3-D standardized vocal tract modeling was established. [Copyright © Elsevier]
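The threshold sweep and binary conversion described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy slice, and the example fat-pad minimum of −120 HU are hypothetical. Reading "−165 HU from each patient's minimum" as an offset subtracted from the fat-pad minimum is also an assumption. The sketch relies only on the fact that air has a lower Hounsfield number than tissue.

```python
AIR_HU = -1024  # lower end of the sweep, as stated in the abstract
STEP_HU = 50    # threshold spacing, as stated in the abstract


def candidate_thresholds(fat_min_hu, step=STEP_HU, floor=AIR_HU):
    """Return thresholds every `step` HU from the fat-pad minimum down to `floor`."""
    thresholds = []
    t = fat_min_hu
    while t > floor:
        thresholds.append(t)
        t -= step
    thresholds.append(floor)  # always include the air end of the range
    return thresholds


def binarize(slice_hu, threshold_hu):
    """Mark each pixel 1 (air) if its HU value is below the threshold, else 0 (tissue)."""
    return [[1 if hu < threshold_hu else 0 for hu in row] for row in slice_hu]


# Hypothetical 3x4 slice in HU: air (about -1000), fat (about -100), soft tissue (about 40).
toy_slice = [
    [-1000, -990, -100, 40],
    [-980, -600, -90, 45],
    [-110, -95, 30, 50],
]
fat_min = -120  # hypothetical per-patient buccal fat-pad minimum

thresholds = candidate_thresholds(fat_min)
# Assumed reading of the reported optimum: 165 HU below the fat-pad minimum.
mask = binarize(toy_slice, fat_min - 165)
```

In the study, each candidate binary image was graded against anatomical criteria and the optimum was estimated by regression; the sketch covers only the sweep and the conversion step.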
Journal of Voice; May2017, Vol. 31 Issue 3, p389.e1-389.e8, 1p
Summary: Objective: To determine the impact of jitter and shimmer on the perceived naturalness of synthesized vowels produced by acoustical simulation with glottal pulses (GP) and a solid model of the vocal tract (SMVT). Study Design: Prospective study. Methods: Synthesized vowels were produced in three steps: 1. Eighty GP were developed (20 with jitter, 20 with shimmer, 20 with jitter+shimmer, 20 without perturbation); 2. An SMVT was produced using rapid prototyping technology, based on magnetic resonance imaging (MRI) of a woman during phonation of /ε/; 3. Acoustic simulations were performed to obtain eighty synthesized /ε/ vowels. Two experiments were performed. First Experiment: three judges rated 120 vowels (20 human + 80 synthesized + 20% repetition) as "human" or "synthesized". Second Experiment: twenty PowerPoint slide sequences were created. Each slide had 4 synthesized vowels, one for each of the four perturbation conditions. Evaluators were asked to rate the vowels from the most natural to the most artificial. Results: First Experiment: all the human vowels were classified as human; 27 of the eighty synthesized vowels were rated as human, of which 15 were produced with jitter+shimmer, 10 with jitter, 2 without perturbation, and none with shimmer. Second Experiment: vowels produced with jitter+shimmer were considered the most natural. Vowels with shimmer and without perturbation were considered the most artificial. Conclusions: The association of jitter and shimmer increased the degree of naturalness of synthesized vowels. Acoustic simulations performed with GP and using SMVT demonstrated a possible method to test the effect of the perturbation measurements on synthesized voices. [ABSTRACT FROM AUTHOR]