Ken-Ichi Sakakibara, Leonardo Fuks, Hiroshi Imagawa, Niro Tayama: Growl voice in ethnic and pop styles

Proceedings of the International Symposium on Musical Acoustics, March 31st to April 3rd 2004 (ISMA2004), Nara, Japan

Ken-Ichi Sakakibara, Leonardo Fuks, Hiroshi Imagawa, Niro Tayama
NTT Communication Science Laboratories, NTT Corporation, Japan
Department of Otolaryngology, The University of Tokyo, Japan
School of Music, Universidade Federal do Rio de Janeiro, Brazil
Department of Speech Physiology, The University of Tokyo, Japan
International Medical Center of Japan, Japan
kis@brl.ntt.co.jp leofuks@serv.com.ufrj.br
imagawa@m.u-tokyo.ac.jp ntayama@imcj.hosp.go.jp

Growl voice in ethnic and pop styles

Abstract
Among the so-called extended vocal techniques, vocal growl is a rather common effect in some ethnic (e.g. the Xhosa people in South Africa) and pop styles (e.g. Jazz, Louis Armstrong-type) of music. Growl usually consists of simultaneous vibrations of the vocal folds and supraglottal structures of the larynx, either in harmonic or subharmonic co-oscillation. This paper examines the growl mechanism using videofluoroscopy and high-speed imaging, and its acoustical characteristics by spectral analysis and model simulation. In growl, the larynx position is usually high and the aryepiglottic folds vibrate. The aryepiglottic constriction is associated with a unique shape of the vocal tract, including the larynx tube, and characterizes growl.

1. Introduction

The term growl originally referred to low-pitched
sounds uttered by animals, such as dogs, or to similar
sounds made by humans, and is therefore described mainly
by auditory-perceptual impression. Growl is widely
observed in singing as well as in shouting and aroused
speech.
The term growl phonation has also been used for the
phonation observed in some singing styles, such as the
jazz singing style of Louis Armstrong 1 and Cab
Calloway [2, 3]. Many jazz, blues, and gospel singers often
use growl in a similar manner. Besides such pop music
from North America, growl styles are widely found in the
pop music of other areas: in Brazil, among samba singers,
particularly carnival lead voices, the pop star Elza Soares, and
the country singing duo Bruno & Marrone; in Japan, Enka (a
popular emotive style) singers, such as Harumi Miyako,
employ it frequently. Some singers use growl extensively
throughout a song, while others use it as a vocal effect for
expressive emphasis.
In ethnic music, one of the most prominent uses of
growl is found in umngqokolo, a vocal tradition
of the Xhosa people in South Africa [11]. In Japanese
Noh theatre, the percussionist's voice, Kakegoe, may present
growl at the beginning of phonation.
Growl may have perceptual similarities with
rough or harsh voice. In phonetic terms, growl
is sometimes described as a voiced aryepiglottic trill
[3]. However, there is no clear evidence of its production
mechanism, such as physiological observation of the
aryepiglottic vibration.
In throat singing (Tyvan khoomei and Mongolian
khoomij), ventricular and vocal fold vibration was
observed for the two different laryngeal voices (drone and
kargyraa) [4, 9]. In drone, the basic voice in throat
singing with a whistle-like high overtone, the ventricular
folds vibrate at the same frequency as the vocal
folds. In kargyraa, which usually sounds one octave
(or more) lower than the modal register, the ventricular
folds vibrate at f0/2 when the vocal folds vibrate at f0.
Moreover, some singers can produce a triple-periodic kargyraa
in which the ventricular folds vibrate at f0/3.
In this paper, the phonation mode with ventricular and vocal fold
vibration is called VVM (vocal-ventricular mode) [4]. In
growl, there is no clear evidence of ventricular fold
vibration.
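As a rough illustration of the subharmonic relation described above, the following sketch assumes that the ventricular folds vibrate at an integer fraction f0/n of the vocal fold frequency f0 (n = 1 for drone, n = 2 for kargyraa one octave below, n = 3 for the triple-periodic variant). The function names and the 140 Hz example value are hypothetical and are not taken from the paper.

```python
import math


def ventricular_frequency(f0_hz: float, n: int) -> float:
    """Ventricular fold frequency when it is the n-th subharmonic of f0."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return f0_hz / n


def pitch_drop_semitones(n: int) -> float:
    """Interval in semitones between f0 and its n-th subharmonic f0/n."""
    return 12.0 * math.log2(n)


if __name__ == "__main__":
    f0 = 140.0  # hypothetical vocal fold frequency in Hz
    modes = (("drone", 1), ("kargyraa", 2), ("triple-periodic kargyraa", 3))
    for mode, n in modes:
        print(f"{mode}: {ventricular_frequency(f0, n):.1f} Hz, "
              f"{pitch_drop_semitones(n):.1f} semitones below f0")
```

With n = 2 the drop is exactly 12 semitones, which matches the "one octave lower" description of kargyraa.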

To read the whole article, follow the link below:

https://www.researchgate.net/publication/228485036_Growl_voice_in_ethnic_and_pop_styles

Ken-Ichi Sakakibara, Leonardo Fuks, Hiroshi Imagawa, Niro Tayama: Growl Voice in Ethnic and Pop Styles

To read the whole paper, follow this link:

https://documentcloud.adobe.com/link/track?uri=urn%3Aaaid%3Ascds%3AUS%3A528dbdd7-5280-4bf2-b448-c79009c9a6d2


Ken-ichi Sakakibara, Hiroshi Imagawa, Seiji Niimi: Vocal fold and false vocal fold vibrations in throat singing and synthesis of khoomei

 

 

2000
Vocal fold and false vocal fold vibrations in throat singing and synthesis of khöömei
Ken-Ichi Sakakibara (1), Hiroshi Imagawa (2), Tomoko Konishi, Kazumasa Kondo, Emi Zuiki Murano (2), Masanobu Kumada (3), and Seiji Niimi (4)
(1) NTT Communication Science Laboratories *
(2) The University of Tokyo
(3) National Rehabilitation Center for the Disabled *
(4) International University of Health and Welfare
Abstract
We observed laryngeal movements in throat singing using physiological methods: the simultaneous recording of singing sounds, EGG, and high-speed digital images. We observed vocal fold and false vocal fold vibrations and estimated the vibration patterns. We also estimated the laryngeal voices by using an inverse filtering method and simulated the vibration patterns using a new physical model, the 2×2-mass model. From these observations, we propose a laryngeal voice model for throat singing and a synthesis system for throat singing.
1 Introduction
Throat singing is a traditional singing style of people who live around the Altai mountains. Khöömei in Tyva and khöömij in Mongolia are representative styles of throat singing. Throat singing is sometimes called biphonic singing, multiphonic singing, overtone singing, or harmonic singing because two or more distinct pitches (musical lines) are produced simultaneously in one tone. One is a low sustained fundamental pitch, called a drone, and the second is a whistle-like harmonic that resonates high (in the range from 1 kHz to 3 kHz) above the drone. Many variations of singing styles in throat singing are classified according to singers and regions. However, it is possible to classify these variations objectively in terms of a source-filter model of speech production.
The laryngeal voices of throat singing can be classified into (i) a pressed voice and (ii) a kargyraa voice based on the listener's impression, acoustical characteristics, and the singer's personal observation of voice production. The pressed voice is the basic laryngeal voice in throat singing and is used as the drone. The kargyraa voice is a very low-pitched voice that lies outside the modal register.
The production of the high-pitched overtone is mainly due to the pipe resonance of the cavity from the larynx to the point of articulation in the vocal tract [1]. In Tyvan khöömei, sygit is a style in which singers articulate by touching the tongue to the palate, and khöömei is one in which they articulate by pursing the lips.
We have physiologically observed two different laryngeal voices and estimated the patterns of the vocal fold and false vocal fold vibrations [6]. We have also simulated the vibration patterns by physical modeling of the larynx with a 2×2-mass model. Based on the physiological observations and the simulation, we propose a new laryngeal voice model and synthesis system for throat singing.
2 Physiological observations
2.1 Methods
We observed laryngeal movements in throat singing directly and indirectly by simultaneous recording of high-speed digital images, EGG (electroglottography) waveforms, and sound waveforms (Fig. 1). The high-speed digital images were captured at 4501 frames/s through a fiberscope inserted into the nasal cavity of a singer. Sound and EGG waveforms were sampled with 12 bits per sample at a sampling frequency of 18 kHz [4]. Two singers with normal voices participated as subjects. One studied khöömei in Tyva and the other studied khöömij in Mongolia.
Fig. 1: High-speed digital image system.
2.2 Results
Common laryngeal movements were observed in the two singers for each of the two laryngeal voices.
Contact: K.-I. Sakakibara, kis@brl.ntt.co.jp, NTT Communication Science Labs, 3-1, Morinosato Wakamiya, Atsugi-shi, 243-0198, Japan
Pressed voice
In pressed-voice production, the following features of the laryngeal movements were observed. (1) Overall constriction of the supra-structures of the glottis was observed, so it was difficult to observe the vibrations of the vocal folds (VFs) directly. (2) Vibration of the supra-structures of the glottis, whose edges are presumably the false vocal folds (FVFs), was observed in the high-speed digital images. (3) The period of the FVF vibrations was almost equal to the period of the EGG waveform. (4) The slope of the EGG curve changed at the beginning of the closed phase of the FVFs; the impedance of the EGG reached its maximal value when the FVFs were open and its minimal value when they were closed (Fig. 2). The graph at the bottom of Fig. 2 depicts the loci of the edges of the FVFs: the upper line is the locus of the left edge and the lower line is that of the right edge.
Kargyraa voice
In kargyraa-voice production, the following features of the laryngeal movement were observed. (1) Overall constriction of the supra-structures of the glottis was observed. (2) The constriction was looser than in the case of the pressed voice. (3) Vibration of the supra-structures of the glottis, whose edges are presumably the FVFs, was observed. (4) The FVF vibration was observed to alternate between almost completely closed and open phases. (5) Vibration of the VFs was observed during the open period of the FVFs. (6) The doubled period of the FVF vibration was equal to the period of the sound waveform. (7) When the FVFs were almost completely closed, the power of the sound became weaker. (8) In the EGG waveform, two different shapes alternated, and the period of the EGG waveform was equal to that of the sound waveform (Fig. 3).
Fig. 2: Pressed voice (from above: sound, EGG, edges of FVF).
Fig. 3: Kargyraa voice (from above: sound, EGG, edges of FVF).
2.3 Discussion
Two common features were observed in the mechanisms of the two laryngeal voice productions: (1) overall constriction of the supra-structures of the glottis and (2) vibration of the supra-structures of the glottis, which are presumably the FVFs. These features are not observed in vowel production in ordinary speech. The differences between the two laryngeal voice productions are (1) the narrowness of the constriction and (2) the manner of FVF vibration.
The EGG waveforms for the pressed voice and kargyraa voice represent the contact area of the supra-structures of the glottis as well as that of the VFs. However, taking into account the high-speed digital images and sound waveforms, the EGG waveforms can be assumed to represent mainly the contact area of the VFs. Thus, we can conclude that the VF vibrations and FVF vibrations have opposite phase in the pressed-voice case. In the kargyraa voice, the FVFs can be assumed to close once for every two periods of closure of the VFs, and this closing blocks the airflow and contributes to the generation of the subharmonic tone of kargyraa.
In a previous study, the open quotient (OQ) in throat singing was estimated from acoustical features to be smaller [2]. However, for both the pressed and kargyraa voices, our physiological observation suggests that the OQ is difficult to estimate because of the contribution of the supra-structures of the glottis. Therefore, the OQ was not estimated.
In the synthesis of throat singing sounds, as pointed out in [1], glottal source modeling is needed to reproduce the timbre. Our physiological observations suggest that the glottal source model of throat singing should include the FVF vibrations as well as the VF vibrations [7].
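The claim that an FVF closure on every second VF cycle generates a subharmonic can be checked with a toy signal. The sketch below is not the authors' model: it assumes one raised-cosine flow pulse per VF cycle and attenuates every second pulse by a hypothetical `gate_depth`, then compares the spectral component at f0/2 with and without the gating.

```python
import cmath
import math


def gated_pulse_train(n_cycles: int, period: int, gate_depth: float) -> list:
    """One raised-cosine flow pulse per VF cycle; every second pulse is
    attenuated by gate_depth to mimic an FVF closure on alternate cycles."""
    signal = []
    for cycle in range(n_cycles):
        gain = 1.0 - gate_depth if cycle % 2 == 1 else 1.0
        for i in range(period):
            pulse = 0.5 * (1.0 - math.cos(2.0 * math.pi * i / period))
            signal.append(gain * pulse)
    return signal


def dft_magnitude(signal: list, cycles: int) -> float:
    """Normalized magnitude of the DFT bin with `cycles` cycles per record."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * cycles * i / n)
              for i, x in enumerate(signal))
    return abs(acc) / n


N_CYCLES, PERIOD = 16, 40            # 16 VF cycles of 40 samples each
F0_BIN, SUB_BIN = N_CYCLES, N_CYCLES // 2

plain = gated_pulse_train(N_CYCLES, PERIOD, gate_depth=0.0)
gated = gated_pulse_train(N_CYCLES, PERIOD, gate_depth=0.8)

print("f0/2 component without FVF gating:", dft_magnitude(plain, SUB_BIN))
print("f0/2 component with FVF gating:   ", dft_magnitude(gated, SUB_BIN))
```

Without gating the f0/2 bin is numerically zero; with gating it becomes clearly non-zero, because the waveform now repeats every two VF cycles.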
3 Laryngeal voice model of throat singing
In this paper, we define the glottal airflow as the airflow through the glottis to the area between the FVFs, and the laryngeal airflow as the airflow through the area between the FVFs to the pharynx.
Glottal airflow estimation
From recorded sounds, we estimated the laryngeal airflow using the inverse filtering technique. In the pressed voice, the estimated laryngeal airflow curve had a small notch just after the curve reached a peak, and the closing of the VFs was apparently not complete (Fig. 4). In the kargyraa voice, the estimated laryngeal airflow curve had two peaks in each period. According to our physiological observation, the VFs vibrate twice in each period of the FVF vibration, and the estimated laryngeal airflow curve showed that in one of the two VF vibrations the closing of the VFs was not complete (Fig. 5).
Fig. 4: Inverse filtered laryngeal airflow of pressed voices for two singers (traces: sound, EGG, laryngeal airflow, airflow derivative).
Fig. 5: Inverse filtered laryngeal airflow of kargyraa voices for two singers (traces: sound, EGG, airflow derivative, laryngeal airflow).
All the power spectra of the estimated glottal airflows showed an increase of power in the range from 1 to 3 kHz, which is where the second formant, corresponding to the whistle-like overtone, appears in throat singing (Figs. 6–8).
Fig. 6: Inverse filtered airflow spectrum of normal voice for two singers.
Fig. 7: Inverse filtered airflow spectrum of pressed voice for two singers.
Fig. 8: Inverse filtered airflow spectrum of kargyraa voice for two singers.
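The text does not say which inverse filtering procedure was used. As a minimal sketch of the general idea, the code below assumes linear prediction (autocorrelation method with the Levinson-Durbin recursion): a toy all-pole "vocal tract" with one hypothetical formant is excited by a single impulse, the resonance is estimated from the output, and the inverse filter A(z) recovers the excitation. All function names and the formant values are illustrative assumptions; only the 18 kHz sampling frequency comes from Section 2.1.

```python
import math


def resonator_denominator(freq_hz, bandwidth_hz, fs_hz):
    """Denominator [1, a1, a2] of a two-pole resonance (one formant)."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)
    theta = 2.0 * math.pi * freq_hz / fs_hz
    return [1.0, -2.0 * r * math.cos(theta), r * r]


def all_pole_filter(excitation, a):
    """y[n] = x[n] - a1*y[n-1] - a2*y[n-2] - ... (toy vocal tract)."""
    out = []
    for n, x in enumerate(excitation):
        y = x - sum(a[k] * out[n - k] for k in range(1, len(a)) if n >= k)
        out.append(y)
    return out


def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Prediction-error filter [1, a1, ..., ap] from autocorrelation r."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / err
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + reflection * a[m - k]
        new_a[m] = reflection
        a = new_a
        err *= 1.0 - reflection * reflection
    return a


def inverse_filter(signal, a):
    """Apply the FIR filter A(z) to remove the estimated resonance."""
    return [sum(a[k] * signal[n - k] for k in range(len(a)) if n >= k)
            for n in range(len(signal))]


FS_HZ = 18000.0                                        # from Section 2.1
true_a = resonator_denominator(1500.0, 200.0, FS_HZ)   # hypothetical formant
excitation = [1.0] + [0.0] * 599                       # single flow impulse
sound = all_pole_filter(excitation, true_a)

est_a = levinson_durbin(autocorrelation(sound, 2), 2)
residual = inverse_filter(sound, est_a)
print("true A(z):     ", true_a)
print("estimated A(z):", est_a)
print("largest residual after the first sample:",
      max(abs(v) for v in residual[1:]))
```

With real recordings the excitation is not an impulse, so the estimate is only approximate and the model order must cover all formants in the analysis band.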
A 2×2-mass model
For a physical simulation of the VF and FVF vibrations, we propose a 2×2-mass model as a self-oscillating model of VF and FVF vibrations (Fig. 9). This model was devised by adding a two-mass model for the FVFs to the ordinary two-mass model for the VFs. The mechanical transmission of vibrations between the VFs and FVFs was not considered. The laryngeal ventricle is modeled as a non-deforming cylinder with a uniform sectional area of 5 cm² and a height of 16 cm. In the simulation, the 2×2-mass model oscillated stably. The simulated laryngeal movements agreed with the above assumptions for the two laryngeal movement patterns of throat singing, for both the pressed and kargyraa voices (Fig. 10). The 2×2-mass model can also simulate an ordinary glottal source in the same way as the two-mass model by setting suitable model parameters [3].
Fig. 9: 2×2-mass model for the VFs and FVFs (labels: vocal folds, false vocal folds, laryngeal ventricle, vocal tract, trachea).
Fig. 10: Laryngeal airflow obtained by using the 2×2-mass model (left: pressed voice, right: kargyraa voice; traces: sound waveform, laryngeal airflow; scale: 1000 cc/s).
Laryngeal voice model
From the physiological observations and estimated laryngeal voices, we assume that (1) in pressed-voice production, the VFs and FVFs vibrate in almost opposite phase; and (2) in kargyraa-voice production, two closed phases of the VFs appear in one period of the glottal volume flow waveform, and the VFs are incompletely closed at one of the two closed phases. Under these assumptions, we propose a laryngeal voice model for throat singing and synthesize throat singing sounds.
Our proposed laryngeal voice model is obtained as follows. We generate an almost sine-shaped glottal airflow, because Fig. 4 suggests that the glottal flow in throat singing is symmetric (Step 1). The glottal airflow is modulated by the vibration of the FVFs (Step 2). Turbulent noise is added according to the open width of the FVFs (Step 3). The output is filtered by the transfer function of the laryngeal ventricle (Step 4) [3].
Fig. 11: Block diagram for laryngeal voice model (blocks: glottal airflow, Ag: glottal area, false glottal area, laryngeal ventricle resonance, laryngeal airflow).
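The four steps of the laryngeal voice model can be sketched as follows. This is an illustrative reading, not the authors' implementation: the raised-cosine flow shape, the cosine-shaped FVF aperture, the noise being proportional to the aperture, the single two-pole ventricle resonance, and every parameter value are assumptions made here for the example.

```python
import math
import random


def laryngeal_source(n_samples, vf_period, fvf_ratio, fvf_depth=0.8,
                     noise_level=0.0, seed=0):
    """Steps 1-3: glottal flow, FVF modulation, aperture-scaled noise.

    vf_period is the VF cycle length in samples; one FVF cycle lasts
    fvf_ratio VF cycles (1 for pressed voice, 2 for kargyraa).
    """
    rng = random.Random(seed)
    fvf_period = fvf_ratio * vf_period
    # Centre the FVF closure on the last VF pulse of each FVF cycle, so that
    # for fvf_ratio == 1 the FVFs move in opposite phase to the VFs.
    closure_centre = fvf_period - vf_period / 2.0
    out = []
    for i in range(n_samples):
        # Step 1: almost sine-shaped (raised-cosine) glottal airflow.
        vf_phase = 2.0 * math.pi * (i % vf_period) / vf_period
        glottal = 0.5 * (1.0 - math.cos(vf_phase))
        # Step 2: FVF aperture between (1 - fvf_depth) and 1 scales the flow.
        fvf_phase = (2.0 * math.pi
                     * ((i % fvf_period) - closure_centre) / fvf_period)
        aperture = 1.0 - fvf_depth * 0.5 * (1.0 + math.cos(fvf_phase))
        # Step 3: turbulent noise scaled by the FVF open width.
        noise = noise_level * aperture * rng.uniform(-1.0, 1.0)
        out.append(glottal * aperture + noise)
    return out


def ventricle_resonance(signal, freq_hz, bandwidth_hz, fs_hz):
    """Step 4: a single two-pole resonance standing in for the ventricle."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)
    b1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    b2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + b1 * y1 + b2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


pressed = laryngeal_source(480, vf_period=60, fvf_ratio=1)
kargyraa = laryngeal_source(480, vf_period=60, fvf_ratio=2)
voiced = ventricle_resonance(kargyraa, 2500.0, 300.0, 18000.0)
print("pressed repeats every 60 samples, kargyraa every 120 samples")
```

Setting `noise_level` above zero adds the Step 3 noise; it is left at zero here so that the period doubling of the kargyraa source is easy to verify.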
4 Synthesis of throat singing
Based on a Klatt synthesizer [5], we propose a synthesis model for throat singing that uses the proposed laryngeal voice model as the source and time-varying formants obtained from recorded throat singing sounds as resonating filters (Fig. 12). Compared with an ordinary glottal airflow model, some improvements of the timbre were observed.
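A minimal sketch of the filter side of such a synthesizer is given below, assuming a cascade of second-order digital resonators of the form used in Klatt-type formant synthesizers, with the centre frequency of each resonator allowed to change at every sample. The formant values, bandwidths, and function names are hypothetical; in the paper the formant tracks come from recorded throat singing.

```python
import math


def klatt_resonator_coefficients(freq_hz, bandwidth_hz, fs_hz):
    """Coefficients of y[n] = A*x[n] + B*y[n-1] + C*y[n-2]."""
    t = 1.0 / fs_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = (2.0 * math.exp(-math.pi * bandwidth_hz * t)
         * math.cos(2.0 * math.pi * freq_hz * t))
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to one
    return a, b, c


def resonate(source, freq_track_hz, bandwidth_hz, fs_hz):
    """One formant resonator whose centre frequency may change per sample."""
    y1 = y2 = 0.0
    out = []
    for x, freq_hz in zip(source, freq_track_hz):
        a, b, c = klatt_resonator_coefficients(freq_hz, bandwidth_hz, fs_hz)
        y = a * x + b * y1 + c * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def cascade_synthesize(source, formant_tracks_hz, bandwidths_hz, fs_hz):
    """Pass the source through one resonator per formant track, in series."""
    signal = list(source)
    for track, bandwidth in zip(formant_tracks_hz, bandwidths_hz):
        signal = resonate(signal, track, bandwidth, fs_hz)
    return signal


FS_HZ = 18000.0
N = 4000
source = [1.0] * N                      # constant input to check the DC gain
tracks = [[500.0] * N, [1500.0] * N]    # hypothetical fixed formant tracks
step = cascade_synthesize(source, tracks, [60.0, 90.0], FS_HZ)
print("steady-state output for unit input:", step[-1])
```

In use, `source` would be a laryngeal source waveform and each entry of `tracks` a measured, time-varying formant contour of the same length.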
Conclusion
We observed the laryngeal movements in throat singing. Both VF and FVF vibrations were observed. The FVF vibrations contribute to the production of both laryngeal voices of throat singing. We also estimated the laryngeal voice source and simulated the laryngeal movements by using a 2×2-mass model. Based on these observations, we proposed a laryngeal source model and a synthesis model for throat singing. These models can also simulate the normal voice. All the power spectra of the simulated glottal airflows showed an increase of power in the range below 3 kHz, where the second formant, corresponding to the whistle-like overtone in throat singing, appears. Our study indicates that the glottal source, as well as the articulation of the tongue and lips, contributes to the production of the whistle-like overtone.
Fig. 12: Block diagram of khöömei synthesizer.
Fig. 13: Synthesized laryngeal airflows, synthesized sounds by khöömei synthesis system, and power spectra of synthesized sounds (left: pressed voice, right: kargyraa voice).
Acknowledgments
We wish to thank Seiji Adachi, Zoya Kyrgys, Koichi Makigami, Naotoshi Osaka, Yoshinao Shiraki, and Masahiko Todoriki for their help and useful discussion.
Bibliography
[1] S. Adachi and M. Yamada. An acoustical study of sound production in biphonic singing xöömij. J. Acoust. Soc. Am., 105(5):2920–2932, 1999.
[2] G. Bloothooft, E. Bringmann, M. van Cappellen, J. B. van Luipen, and K. P. Thomassen. Acoustics and perception of overtone singing. J. Acoust. Soc. Am., 92(4):1827–1836, 1992.
[3] H. Imagawa, K.-I. Sakakibara, T. Konishi, E. Z. Murano, and S. Niimi. Throat singing synthesis by a laryngeal voice model based on vocal fold and false vocal fold vibrations. Tech. Rep. IECE, SP2000-140:71–78, Feb. 2001. In Japanese.
[4] S. Kiritani, H. Imagawa, and H. Hirose. Vocal cord vibration in the production of consonants - observation by means of high-speed digital imaging using a fiberscope. J. Acoust. Soc. Jpn. (E), 17:1–8, 1996.
[5] D. H. Klatt. Software for a cascade/parallel formant synthesizer. J. Acoust. Soc. Am., 67(3):971–995, 1980.
[6] T. C. Levin and M. E. Edgerton. The throat singers of Tuva. Scientific American, (Sep. 1999):80–87, 1999.
[7] K.-I. Sakakibara, S. Adachi, T. Konishi, K. Kondo, E. Z. Murano, M. Kumada, M. Todoriki, H. Imagawa, and S. Niimi. Observation of vocal fold vibrations in Tyvan and Mongolian throat singing. Tech. Rep. Musical Acoust., Acoust. Soc. Jpn., 19-4:41–48, Sep. 2000. In Japanese.