Ken-Ichi Sakakibara, Leonardo Fuks, Hiroshi Imagawa, Niro Tayama: Growl Voice in Ethnic and Pop Styles


Growl Voice in Ethnic and Pop Styles

Ken-Ichi Sakakibara (1, 2), Leonardo Fuks (3), Hiroshi Imagawa (4), Niro Tayama (5)

1 NTT Communication Science Laboratories, NTT Corporation, Japan

2 Department of Otolaryngology, The University of Tokyo, Japan

3 School of Music, Universidade Federal do Rio de Janeiro, Brazil

4 Department of Speech Physiology, The University of Tokyo, Japan

5 International Medical Center of Japan, Japan


Among the so-called extended vocal techniques, vocal growl is a rather common effect in some ethnic (e.g. the Xhosa people in South Africa) and pop styles (e.g. Jazz, Louis Armstrong-type) of music. Growl usually consists of simultaneous vibrations of the vocal folds and supraglottal structures of the larynx, either in harmonic or subharmonic co-oscillation.

This paper examines the growl mechanism using videofluoroscopy and high-speed imaging, and its acoustical characteristics by spectral analysis and model simulation. In growl, the larynx position is usually high and the aryepiglottic folds vibrate. The aryepiglottic constriction is associated with a unique shape of the vocal tract, including the larynx tube, and characterizes growl.

Ken-ichi Sakakibara, Hiroshi Imagawa, Seiji Niimi: Vocal fold and false vocal fold vibrations in throat singing and synthesis of khoomei



Vocal fold and false vocal fold vibrations in throat singing and synthesis of khoomei
Ken-Ichi Sakakibara, Hiroshi Imagawa, Tomoko Konishi, Kazumasa Kondo, Emi Zuiki Murano, Masanobu Kumada, and Seiji Niimi
NTT Communication Science Laboratories; The University of Tokyo; National Rehabilitation Center for the Disabled; International University of Health and Welfare
We observed laryngeal movements in throat singing using physiological methods: simultaneous recording of singing sounds, EGG, and high-speed digital images. We observed vocal fold and false vocal fold vibrations and estimated their vibration patterns. We also estimated the laryngeal voices by an inverse filtering method and simulated the vibration patterns using a new physical model, the 2×2-mass model. From these observations, we propose a laryngeal voice model for throat singing and a synthesis system for throat singing.
1 Introduction
Throat singing is a traditional singing style of people who live around the Altai mountains. Khöömei in Tyva and Khöömij in Mongolia are representative styles of throat singing. Throat singing is sometimes called biphonic singing, multiphonic singing, overtone singing, or harmonic singing because two or more distinct pitches (musical lines) are produced simultaneously in one tone. One is a low sustained fundamental pitch, called a drone, and the second is a whistle-like harmonic that resonates high (in the range from 1 kHz to 3 kHz) above the drone.
Many variations of singing styles in throat singing are classified according to singers and regions. However, it is possible to classify these variations objectively in terms of a source-filter model of speech production.
The laryngeal voices of throat singing can be classified into (i) a pressed voice and (ii) a kargyraa voice, based on the listener’s impression, acoustical characteristics, and the singer’s personal observation of voice production. The pressed voice is the basic laryngeal voice in throat singing and is used as the drone. The kargyraa voice is a very low-pitched voice whose range lies outside the modal register.
The production of the high-pitched overtone is mainly due to the pipe resonance of the cavity from the larynx to the point of articulation in the vocal tract [1]. In Tyvan khöömei, sygit is a style in which singers articulate by touching the tongue to the palate, and khöömei is one in which they articulate by pursing the lips.
We have physiologically observed two different laryngeal voices and estimated the patterns of the vocal fold and false vocal fold vibrations [6]. We have also simulated the vibration patterns by physical modeling of the larynx: the 2×2-mass model. Based on the physiological observations and the simulation, we propose a new laryngeal voice model and synthesis system for throat singing.
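As a small worked illustration of the drone-and-overtone description above, the following sketch lists which harmonics of a drone fall inside the 1 kHz to 3 kHz range. The function name `harmonics_in_band` and the 150 Hz drone are assumptions made for illustration only; neither comes from the paper.

```python
def harmonics_in_band(drone_hz, low_hz=1000.0, high_hz=3000.0):
    """List (harmonic number, frequency in Hz) for drone harmonics inside [low_hz, high_hz]."""
    if drone_hz <= 0:
        raise ValueError("drone_hz must be positive")
    harmonics = []
    n = 1
    while n * drone_hz <= high_hz:
        if n * drone_hz >= low_hz:
            harmonics.append((n, n * drone_hz))
        n += 1
    return harmonics


# 150 Hz is a hypothetical drone pitch chosen only for illustration.
for number, freq in harmonics_in_band(150.0):
    print(f"harmonic {number}: {freq:.0f} Hz")
```

For this hypothetical drone, the candidates for the whistle-like melody run from the 7th to the 20th harmonic; a lower drone would place more harmonics in the same band.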
2 Physiological observations
2.1 Methods
We observed laryngeal movements in throat singing directly and indirectly by simultaneous recording of high-speed digital images, EGG (electroglottography) waveforms, and sound waveforms (Fig. 1). The high-speed digital images were captured at 4501 frames/s through a fiberscope inserted into the nasal cavity of a singer. Sound and EGG waveforms were sampled at 12 bits per sample and an 18 kHz sampling frequency [4]. Two normal singers participated as subjects. One studied khöömei in Tyva and the other studied khöömij in Mongolia.
Fig. 1: High-speed digital image system.
2.2 Results
Common laryngeal movements were observed in the two singers for each of the two laryngeal voices.
Contact: K.-I. Sakakibara, NTT Communication Science Labs, 3-1, Morinosato Wakamiya, Atsugi-shi, 243-0198, Japan
Pressed voice
In pressed-voice production, the following features of the laryngeal movements were observed. (1) Overall constriction of the supra-structures of the glottis was observed; thus it was difficult to observe the vibrations of the vocal folds (VFs) directly. (2) Vibration of the supra-structures of the glottis, whose edges are presumably the false vocal folds (FVFs), was observed in the high-speed digital images. (3) The period of the FVF vibrations was almost equal to the period of the EGG waveform. (4) The slope of the EGG curve changed at the beginning of the closed phase of the FVFs; the EGG impedance reached its maximum when the FVFs were open and its minimum when they were closed (Fig. 2). The graph at the bottom of Fig. 2 depicts the loci of the FVF edges: the upper line is the locus of the left edge and the lower line is that of the right edge.
Kargyraa voice
In kargyraa-voice production, the following features of the laryngeal movement were observed. (1) Overall constriction at the supra-structures of the glottis was observed. (2) The constriction was looser than that in the case of the pressed voice. (3) Vibration of the supra-structures of the glottis, whose edges are presumably the FVFs, was observed. (4) The phases of the FVF vibrations were observed to alternate between almost completely closed and open. (5) Vibration of the VFs was observed during the open period of the FVFs. (6) The double period of the FVF vibration was equal to the period of the sound waveform. (7) When the FVFs were almost completely closed, the power of the sound became weaker. (8) In the EGG waveform, two different shapes alternated, and the period of the EGG waveform was equal to that of the sound waveform (Fig. 3).
Fig. 2: Pressed voice (from above: sound, EGG, edges of FVF).
Fig. 3: Kargyraa voice (from above: sound, EGG, edges of FVF).
2.3 Discussion
Two common features were observed in the mechanisms of the two laryngeal voice productions: (1) overall constriction of the supra-structures of the glottis and (2) vibration of the supra-structures of the glottis, which are presumably the FVFs. These features are not observed in vowel production in ordinary speech. The differences between the two laryngeal voice productions are (1) the narrowness of the constriction and (2) the manner of FVF vibration.
The EGG waveforms for the pressed voice and kargyraa voice represent the contact area of the supra-structures of the glottis as well as that of the VFs. However, taking into account the high-speed digital images and sound waveforms, the EGG waveforms can be assumed to mainly represent the contact area of the VFs. Thus, we can conclude that the VF vibrations and FVF vibrations have opposite phase in the pressed-voice case. In the kargyraa voice, the FVFs can be assumed to close once for every two periods of closure of the VFs, and this closing blocks airflow and contributes to the generation of the subharmonic tone of kargyraa.
In a previous study, the open quotient (OQ) in throat singing was estimated to be smaller on the basis of acoustical features [2]. However, for both the pressed and kargyraa voices, our physiological observations suggest that the OQ is difficult to estimate because of the contribution of the supra-structures of the glottis. Therefore, the OQ was not estimated.
In the synthesis of throat singing sounds, as pointed out in [1], glottal source modeling is needed to reproduce the timbre. Our physiological observations suggest that the glottal source model of throat singing should include the FVF vibrations as well as the VF vibrations [7].
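The subharmonic argument for kargyraa can be illustrated numerically. The sketch below is not the authors' model: it assumes a raised-cosine pulse train as a stand-in for VF flow and a simple gate that attenuates every second VF cycle, mimicking FVFs that close once for every two VF periods. The sampling rate, the 160 Hz VF frequency, and the attenuation factor 0.3 are all hypothetical values.

```python
import numpy as np

FS = 16000                           # sampling rate in Hz (assumed)
F_VF = 160.0                         # VF vibration frequency in Hz (hypothetical)
SAMPLES_PER_CYCLE = int(FS / F_VF)   # 100 samples per VF cycle
N_CYCLES = 80

n = np.arange(SAMPLES_PER_CYCLE * N_CYCLES)
t = n / FS

# Stand-in for the glottal flow: one smooth pulse per VF cycle.
vf_flow = 0.5 * (1.0 - np.cos(2.0 * np.pi * F_VF * t))

# FVF gate: pass even-numbered VF cycles and attenuate odd-numbered ones,
# mimicking FVFs that close once for every two VF periods.
cycle_index = n // SAMPLES_PER_CYCLE
fvf_gate = np.where(cycle_index % 2 == 0, 1.0, 0.3)
gated_flow = vf_flow * fvf_gate


def magnitude_at(signal, freq_hz, fs=FS):
    """Magnitude of the Hann-windowed spectrum at the bin nearest freq_hz."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq_hz))]


sub_hz = F_VF / 2.0
print("ungated level at f/2:", magnitude_at(vf_flow, sub_hz))
print("gated level at f/2:  ", magnitude_at(gated_flow, sub_hz))
```

Under these assumptions the ungated pulse train has essentially no energy at half the VF frequency, while the gated flow shows a clear component there, which is the period doubling described in the text.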
3 Laryngeal voice model of throat singing
In this paper, we define the glottal airflow as the airflow through the glottis to the area between the FVFs, and the laryngeal airflow as the airflow through the area between the FVFs to the pharynx.
Glottal airflow estimation
From recorded sounds, we estimated the laryngeal airflow using the inverse filtering technique. In the pressed voice, the estimated laryngeal airflow curve had a small notch just after the curve reached a peak, and the closing of the VFs was apparently not complete (Fig. 4). In the kargyraa voice, the estimated laryngeal airflow curve had two peaks in each period. From our physiological observation, the VFs vibrate twice in each period of the FVF vibration, and the estimated laryngeal airflow curve showed that in one of the two VF vibrations, the closing of the VFs was not complete (Fig. 5).
Fig. 4: Inverse filtered laryngeal airflow of pressed voices for two singers.
Fig. 5: Inverse filtered laryngeal airflow of kargyraa voices for two singers.
All the power spectra of the estimated glottal airflows showed an increase of power in the range from 1 to 3 kHz, which is where the second formant frequency, corresponding to the whistle-like overtone, appears in throat singing (Figs. 6–8).
Fig. 6: Inverse filtered airflow spectrum of normal voice for two singers.
Fig. 7: Inverse filtered airflow spectrum of pressed voice for two singers.
Fig. 8: Inverse filtered airflow spectrum of kargyraa voice for two singers.
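The paper does not state which inverse filtering procedure was used. As one common possibility, the sketch below uses autocorrelation-method linear prediction as a stand-in: it estimates an all-pole filter from the signal and applies its inverse to obtain a residual that approximates the source. The function names, the prediction order, and the synthetic test signal are assumptions for illustration.

```python
import numpy as np


def lpc_coefficients(signal, order):
    """Autocorrelation-method linear prediction; returns [1, a1, ..., ap]."""
    x = np.asarray(signal, dtype=float)
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))


def inverse_filter(signal, order=4):
    """Apply the estimated inverse (FIR) filter and return (residual, coefficients)."""
    a = lpc_coefficients(signal, order)
    residual = np.convolve(signal, a)[: len(signal)]
    return residual, a


# Synthetic test signal: a pulse train (plus a little noise) passed through
# a hypothetical single resonance y[n] = e[n] + 1.6 y[n-1] - 0.9 y[n-2].
rng = np.random.default_rng(0)
excitation = np.zeros(2000)
excitation[::100] = 1.0
excitation += 0.01 * rng.standard_normal(len(excitation))
speech = np.zeros_like(excitation)
for i in range(len(speech)):
    speech[i] = excitation[i]
    if i >= 1:
        speech[i] += 1.6 * speech[i - 1]
    if i >= 2:
        speech[i] -= 0.9 * speech[i - 2]

residual, coeffs = inverse_filter(speech, order=4)
print("signal energy:  ", np.sum(speech ** 2))
print("residual energy:", np.sum(residual ** 2))
```

On real recordings the order, pre-emphasis, and analysis window all matter, and lip radiation has to be compensated before the residual can be read as airflow; none of that is handled here.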
A 2×2-mass model
For a physical simulation of the VF and FVF vibrations, we propose a 2×2-mass model as a self-oscillating model of VF and FVF vibrations (Fig. 9). This model was devised by introducing a two-mass model for the FVFs to the ordinary two-mass model for the VFs. The mechanical transmission of vibrations between the VFs and FVFs was not considered. The laryngeal ventricle is modeled as a non-deforming cylinder whose sectional area is uniformly 5 cm² and whose height is 16 cm. In the simulation, the 2×2-mass model oscillated stably. The simulation of laryngeal movements using the 2×2-mass model agreed with the above assumptions for the two laryngeal movement patterns of throat singing for both the pressed and kargyraa voices (Fig. 10). The 2×2-mass model can simulate an ordinary glottal source in the same way as the two-mass model by setting suitable model parameters [3].
Fig. 9: 2×2-mass model for the VFs and FVFs (figure labels: trachea, vocal folds, laryngeal ventricle, false vocal folds, vocal tract).
Fig. 10: Laryngeal airflow obtained by using the 2×2-mass model (left: pressed voice, right: kargyraa voice). Panels: sound waveform and laryngeal airflow (scale: 1000 cc/s).
Laryngeal voice model
From the physiological observations and estimated laryngeal voices, we assume that (1) in pressed-voice production, the VFs and FVFs vibrate in almost opposite phase; and (2) in kargyraa-voice production, two closed phases of the VFs appear in one period of the glottal volume flow waveform, and the VFs are incompletely closed at one of the two closed phases. Under these assumptions, we propose a laryngeal voice model for throat singing and synthesize throat singing sounds.
Our proposed laryngeal voice model is obtained as follows. We generate an almost sine-shaped glottal airflow, because Fig. 4 indicates that the glottal flow of throat singing is nearly symmetric (Step 1). The glottal airflow is modulated by the vibration of the FVFs (Step 2). Turbulent noise is added according to the open width of the FVFs (Step 3). The output is convolved with the response of the laryngeal ventricle (Step 4) [3].
Fig. 11: Block diagram for laryngeal voice model (blocks: glottal airflow, glottal area Ag, false glottal area, laryngeal ventricle resonance, laryngeal airflow).
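Steps 1 to 4 of the laryngeal voice model can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the raised-cosine flow shape, the multiplicative FVF modulation, the Gaussian noise, the damped-sinusoid ventricle response with a direct path, and every parameter name and default value are hypothetical.

```python
import numpy as np


def laryngeal_source(f0_hz, fs=16000, duration=0.2, fvf_period_ratio=1,
                     fvf_depth=0.6, noise_gain=0.02, ventricle_hz=2000.0,
                     ventricle_bw_hz=300.0, resonance_gain=0.2, seed=0):
    """Return a synthetic laryngeal airflow signal (arbitrary units)."""
    t = np.arange(int(fs * duration)) / fs

    # Step 1: nearly sine-shaped, symmetric glottal airflow in [0, 1].
    glottal = 0.5 * (1.0 - np.cos(2.0 * np.pi * f0_hz * t))

    # Step 2: FVF opening in [1 - fvf_depth, 1]. With fvf_period_ratio=1 the
    # FVFs are narrowest when the glottal flow peaks (opposite phase);
    # with fvf_period_ratio=2 they narrow once every two VF periods.
    fvf_rate_hz = f0_hz / fvf_period_ratio
    fvf_closure = 0.5 * (1.0 - np.cos(2.0 * np.pi * fvf_rate_hz * t))
    fvf_open = 1.0 - fvf_depth * fvf_closure
    flow = glottal * fvf_open

    # Step 3: turbulent noise scaled by the FVF opening.
    rng = np.random.default_rng(seed)
    flow = flow + noise_gain * fvf_open * rng.standard_normal(len(t))

    # Step 4: convolve with a placeholder ventricle response, a direct path
    # plus one damped sinusoid (20 ms long).
    k = np.arange(int(0.02 * fs)) / fs
    response = resonance_gain * np.exp(-np.pi * ventricle_bw_hz * k) \
        * np.sin(2.0 * np.pi * ventricle_hz * k)
    response[0] += 1.0
    return np.convolve(flow, response)[: len(t)]


pressed_like = laryngeal_source(150.0, fvf_period_ratio=1)
kargyraa_like = laryngeal_source(150.0, fvf_period_ratio=2)
print(pressed_like[:5], kargyraa_like[:5])
```

Setting `fvf_period_ratio=1` corresponds loosely to the pressed-voice assumption and `fvf_period_ratio=2` to the kargyraa assumption; a faithful ventricle response would have to come from the geometry or measurements rather than from this placeholder.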
4 Synthesis of throat singing
Based on a Klatt synthesizer [5], we propose a synthesis model for throat singing, which has the proposed laryngeal voice model as the source and time-varying formants obtained from recorded throat singing sounds as resonating filters (Fig. 12). Compared with an ordinary glottal airflow model, some improvements of the timbre were observed.
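The resonating filters of a Klatt-type synthesizer are second-order digital resonators. The sketch below shows one such resonator driven by a formant track that changes from sample to sample. The coefficient formulas follow the resonator described by Klatt [5]; the function names, the fixed bandwidth, and the example formant sweep are assumptions, and the formant tracks extracted from recordings in the paper are not reproduced here.

```python
import math


def resonator_coefficients(freq_hz, bandwidth_hz, fs):
    """Coefficients (a, b, c) of y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    c = -math.exp(-2.0 * math.pi * bandwidth_hz / fs)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz / fs) \
        * math.cos(2.0 * math.pi * freq_hz / fs)
    a = 1.0 - b - c  # gives unity gain at 0 Hz
    return a, b, c


def time_varying_resonator(source, formant_track_hz, bandwidth_hz, fs):
    """Filter `source` with a resonator whose centre frequency follows the track."""
    y1 = y2 = 0.0
    out = []
    for x, freq_hz in zip(source, formant_track_hz):
        a, b, c = resonator_coefficients(freq_hz, bandwidth_hz, fs)
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# Hypothetical usage: an impulse train filtered by a formant gliding from
# 1000 Hz to 2000 Hz, loosely imitating a moving whistle-like overtone.
fs = 16000
n_samples = 4000
source = [1.0 if i % 100 == 0 else 0.0 for i in range(n_samples)]
track = [1000.0 + 1000.0 * i / (n_samples - 1) for i in range(n_samples)]
filtered = time_varying_resonator(source, track, bandwidth_hz=100.0, fs=fs)
print(len(filtered))
```

A full synthesizer would cascade several such resonators, one per formant, and feed them with the laryngeal source rather than a bare impulse train.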
We observed the laryngeal movements in throat singing and found both VF and FVF vibrations. The FVF vibrations contribute to the production of both laryngeal voices of throat singing. We also estimated the laryngeal voice source and simulated the laryngeal movements by using a 2×2-mass model. Based on these observations, we proposed a laryngeal source model and a synthesis model for throat singing. These models can also simulate the normal voice. All the power spectra of the simulated glottal airflows showed an increase of power in the range below 3 kHz, where the second formant frequency, corresponding to the whistle-like overtone in throat singing, appears. Our study indicates that the glottal source, as well as the articulation of the tongue and lips, contributes to the production of the whistle-like overtone.
Fig. 12: Block diagram of khöömei synthesizer.
Fig. 13: Synthesized laryngeal airflows, sounds synthesized by the khöömei synthesis system, and power spectra of the synthesized sounds (left: pressed voice, right: kargyraa voice).
We wish to thank Seiji Adachi, Zoya Kyrgys, Koichi Makigami, Naotoshi Osaka, Yoshinao Shiraki, and Masahiko Todoriki for their help and useful discussion.
[1] S. Adachi and M. Yamada. An acoustical study of sound production in biphonic singing xöömij. J. Acoust. Soc. Am., 105(5):2920–2932, 1999.
[2] G. Bloothooft, E. Bringmann, M. van Cappellen, J. B. van Luipen, and K. P. Thomassen. Acoustics and perception of overtone singing. J. Acoust. Soc. Am., 92(4):1827–1836, 1992.
[3] H. Imagawa, K.-I. Sakakibara, T. Konishi, E. Z. Murano, and S. Niimi. Throat singing synthesis by a laryngeal voice model based on vocal fold and false vocal fold vibrations. Tech. Rep. IECE, SP2000-140:71–78, Feb. In Japanese.
[4] S. Kiritani, H. Imagawa, and H. Hirose. Vocal cord vibration in the production of consonants: observation by means of high-speed digital imaging using a fiberscope. J. Acoust. Soc. Jpn. (E), 17:1–8, 1996.
[5] D. H. Klatt. Software for a cascade/parallel formant synthesizer. J. Acoust. Soc. Am., 67(3):971–995, 1980.
[6] T. C. Levin and M. E. Edgerton. The throat singers of Tuva. Scientific American, (Sep. 1999):80–87, 1999.
[7] K.-I. Sakakibara, S. Adachi, T. Konishi, K. Kondo, E. Z. Murano, M. Kumada, M. Todoriki, H. Imagawa, and S. Niimi. Observation of vocal fold vibrations in Tyvan and Mongolian throat singing. Tech. Rep. Musical Acoust., Acoust. Soc. Jpn., 19-4:41–48, Sep. 2000. In Japanese.



Singing in the MRI with Tyley Ross Making the Voice Visible

Added on 23 Apr. 2019

Tyley Ross is a Grammy-nominated recording artist, the co-founder of the Universal Records recording act The East Village Opera Company, and a Dora Award-winning musical theater actor. He is based in New York City.

15 minutes of dynamic MRI voice videos – including rapping, beat boxing, singing

Added on 12 Jul. 2017

This is the demonstration of dynamic MRI videos that were used at @whatsinavoice17 at the Summer Science Exhibition 2017. Includes: Reeps One, April Fredrick, Thermoflynamics, Prof Elemental, Jess Ramsey, Catharine Woodward, Ellie Yuan, Duncan Wisbey, Hector Scott-Manly, Jonny Berliner, Howard Read and Lesley Garrett.

Trailer “The Voice” (DVD) – Insights into the Physiology of Singing and Speaking

Added on 28 Jun. 2017

Authors: Bernhard Richter, Matthias Echternach, Louisa Traser, Michael Burdumy, Claudia Spahn. 160 min., DVD-ROM for PC/Mac. Languages: German/English. You can purchase this DVD here:…

Information about the DVD: Which singer would not like to see exactly what is happening in his or her body while singing? Instrumentalists, such as string players, pianists or guitarists, can observe their sound generation at any particular point in time. However, this is not readily possible for singers, since the crucial movements of the larynx and the diaphragm, as well as the tongue and the velum (soft palate), are hidden inside the body and not visible from outside. In an innovative video DVD-ROM, the Freiburg Institute of Musicians’ Medicine has utilized modern, high-end visualization procedures from the field of medicine to gain insights into the processes in the human body during singing and speaking. First of all, the anatomical structures are precisely explained in the films so that even viewers without medical training can identify and understand the functional interactions taking place. The processes made visible in the films are explained in spoken commentaries. The approx. 100 film clips incorporate active recordings of the lungs and the diaphragm, the larynx and the vocal tract during various song styles (classical/pop/yodeling/overtone singing) and in different vocal ranges (soprano, alto, tenor, bass), as well as during speaking (laughing/crying). These films provide unique insights into the physiology of the creation and forming of sounds, as well as the breathing support process. These materials can help learners better understand and improve their singing abilities. This is the case for ambitious laypersons as well as for professionals artistically active in various genres. In addition, they enable singing instructors to illustrate these complex processes to students more clearly and concisely. This DVD-ROM will be a great benefit for all singers!


The Vocal Tract – Vocal Resonance.

Added on 22 May 2018

Johan Sundberg – “Vocal Tract resonance in singing”, 1987.… Manuel Garcia – “Hints on Singing”, 1894, pgs 12-13. Douglas Stanley – “Your Voice”, “Applied science and vocal art. Singing and speaking”, 1945, pg 61. William Vennard – “Singing the mechanism and the technic”, 1967, pg 82, paragraph 297.