and Seiji Niimi∗4 ∗1 NTT Communication Science Laboratories, ∗2 The University of Tokyo, ∗3 National Rehabilitation Center for the Disabled, ∗4 International University of Health and Welfare
Throat singing is a traditional singing style of people who live around the Altai mountains. Khöömei in Tyva and Khöömij in Mongolia are representative styles of throat singing. Throat singing is sometimes called biphonic singing, multiphonic singing, overtone singing, or harmonic singing because two or more distinct pitches (musical lines) are produced simultaneously in one tone. One is a low sustained fundamental pitch, called a drone, and the second is a whistle-like harmonic that resonates high (in the range from 1 kHz to 3 kHz) above the drone. Many variations of singing styles in throat singing are classified according to singers and regions. However, it is possible to classify these variations objectively in terms of a source-filter model of speech production. The laryngeal voices of throat singing can be classified into (i) a pressed voice and (ii) a kargyraa voice based on listeners' impressions, acoustical characteristics, and the singer's personal observation of voice production. The pressed voice is the basic laryngeal voice in throat singing and is used as the drone. The kargyraa voice is a very low-pitched voice that ranges out of the modal register. The production of the high-pitched overtone is mainly due to the pipe resonance of the cavity from the larynx to the point of articulation in the vocal tract. In Tyvan khöömei, sygit is a style where singers articulate by touching the tongue to the palate and khöömei is one where they articulate by pursing the lips. We have also physiologically observed two different laryngeal voices and estimated the patterns of the vocal fold and false vocal fold vibrations. We have simulated the vibration patterns by physical modeling of the larynx: a 2 × 2-mass model. Based on the physiological observations and the simulation, we propose a new laryngeal voice model and synthesis system for throat singing.
Università di Padova Dipartimento di Storia delle Arti Visive e della Musica firstname.lastname@example.org
ISTC (Istituto di Scienze e Tecnologie della Cognizione), CNR, Padova email@example.com
Università di Padova Dipartimento di Storia delle Arti Visive e della Musica firstname.lastname@example.org
ABSTRACT Demetrio Stratos (1945-1979) was a singer known for his creative use of vocal techniques such as diplophony, bitonality and diphony (overtone singing). His need to know the scientific explanation for such vocal behaviors drove him to visit the ISTC in Padova (Institute of Cognitive Sciences and Technologies) in the late Seventies. ISTC technical resources and the collaboration with Franco Ferrero and Lucio Croatto (phonetics and phoniatric experts) allowed him to analyze his own phonoarticulatory system and the effects he was able to produce. This paper presents the results of a broad historical survey of Stratos’ research at the ISTC. The historical investigation is made possible by textual criticism and interpretation based on different sources: digital and audio sources, sketches, various bibliographical references (published or unpublished) and oral communications. Sonograms of Stratos’ exercises (made at the time and recently redone) show that various abilities existed side by side in the same performer, which is rare to find. This marks his uniqueness in the avant-garde and popular music scene of the time. The ultimate aim of this study was to produce a digital archive for the preservation and conservation of the sources related to this period.
The Chöömij of Mongolia A Spectral Analysis of Overtone Singing
SELECTED REPORTS IN Ethnomusicology Volume II, No. 1 1974
CHÖÖMIJ* IS THE MONGOLIAN NAME for a solo style of overtone singing where two distinct pitch lines are sounded throughout. One, a nasal-sounding drone of relatively constant pitch, corresponds to the fundamental; the other, consisting of piercing, whistle-like tones, forms a melody line above the drone and results from the reinforcement of individual overtones within the ambitus of the 5th through 13th partials.
Reinforcement of partials is achieved by characteristic changes in the shape and volume of the mouth cavity. This is reminiscent of the principle of the Jew’s harp,(1) where a vibrating tongue sounded at the lips produces a drone fundamental which the player modifies by shaping his mouth cavity so as to form a resonance chamber of critical volume. The volume of this chamber, functioning on the principle of a Helmholtz resonator, reinforces a narrow frequency band within an existing spectrum. This band is sufficiently narrow to enable the singer to select a given single partial above the drone in accordance with the degree of modification made by him. The principle involving the reinforcement of discrete partials by a specific shaping of the mouth cavity is thus common to both chöömij and the Jew’s harp. A difference, however, lies in the physical origin of the fundamental. In the Jew’s harp it is produced at the lips; in chöömij it originates in the throat region.
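The Helmholtz principle invoked here can be illustrated with the textbook formula f = (c / 2π) · sqrt(A / (V · L)), where V is the cavity volume, A the area of the opening and L its effective length. The following is a minimal sketch only: the function name, the speed of sound of 350 m/s, and every dimension used are assumptions chosen for illustration, not measurements of a singer, and a real mouth cavity is at best roughly approximated by an ideal Helmholtz resonator.

```python
import math

def helmholtz_frequency(volume_m3, opening_area_m2, opening_length_m,
                        speed_of_sound=350.0):
    """Resonance frequency (Hz) of an ideal Helmholtz resonator.

    speed_of_sound is an assumed round value for warm, humid air.
    """
    return (speed_of_sound / (2 * math.pi)) * math.sqrt(
        opening_area_m2 / (volume_m3 * opening_length_m)
    )

# Hypothetical cavity volumes (cm^3) with a 1 cm^2 opening of 1 cm length.
for volume_cm3 in (80, 50, 30):
    f = helmholtz_frequency(volume_cm3 * 1e-6, 1e-4, 0.01)
    print(f"{volume_cm3} cm^3 -> {f:.0f} Hz")
```

Under these assumptions, shrinking the cavity raises the resonance, which is the mechanism the text describes for stepping the reinforced band from one partial to the next.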
The unusual quality of chöömij arouses special interest. Subjective statements cannot take us very far and we need a more objective basis for describing it. The Melograph Model C offers a mechanical approach to a more accurate and precise representation of this complex vocal phenomenon.
A number of recordings of this style have been made(2) and an analysis of them will appear in a more comprehensive study. I have selected for detailed melographic analysis the initial phrase of one performance which is distinguished by the unusually long duration of its ictus, 1.4 seconds. This is reproduced on Plate 1 and transcribed in figure 1. The phrase of three descending tones is preceded by a groan-like attack. The spectral graph presents a pattern of equidistant bands, corresponding to frequencies that remain virtually constant for the duration of the descending phrase. This is, in fact, true for the entire piece from which this example is drawn. An equidistant band pattern maintained throughout the changes in the whistle-tone pitches suggests (a) that these are generated above a fundamental of constant pitch; and (b) that they are due to harmonic overtone generation, a predictable characteristic of wind instruments. Figure 2 shows in staff notation the approximate partials as they appear in consecutive order above a fundamental of about 100 Hz.
Fig. 2. (The tolerance of the filter permits only approximate readings of the frequency values.)
Most important to note here is not the precise distance between the bands or their absolute frequency value, but rather (a) the pitch vocabulary of the partials from which the melody tones are selected, namely the 6th to 13th partials but excluding the 11th; and (b) the general range of the fundamental. As concerns the chöömij style, I would suggest that a physiological limitation prevents the singer from descending below the 6th or from ascending above the 13th partial if he wishes to isolate the desired melodic tones with sufficient intensity. The melodic style would seem to dictate the selection of tones agreeable to an anhemitonic pentatonic scale widespread in Mongolian music, and this would naturally require the lowering of the 7th partial from f‑ to e’ and the avoidance of the 11th partial altogether.
Finally, the stable drone fundamental is, in the author’s experience, invariably selected from within the approximate range of G-d. The reason is that only this range permits the generation of a corresponding complement of partials that the mouth cavity can effectively filter.
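The pitch vocabulary discussed above can be examined numerically. The sketch below lists the 6th through 13th partials of a fundamental and shows how far each lies from the nearest semitone step above that fundamental. Measuring against equal-tempered semitones is an assumption made purely for illustration (it is not a claim about Mongolian tuning), and the function names are hypothetical.

```python
import math

def partial_deviation_cents(n):
    """Deviation (in cents) of the n-th partial from the nearest
    equal-tempered semitone step counted from the fundamental."""
    cents_above_fundamental = 1200 * math.log2(n)
    nearest_step = round(cents_above_fundamental / 100) * 100
    return cents_above_fundamental - nearest_step

def melody_partials(fundamental_hz, first=6, last=13):
    """(partial number, frequency in Hz, deviation in cents) for each partial."""
    return [(n, n * fundamental_hz, partial_deviation_cents(n))
            for n in range(first, last + 1)]

# Fundamental of about 100 Hz, as read from the melogram.
for n, freq, dev in melody_partials(100.0):
    print(f"partial {n:2d}: {freq:6.0f} Hz  {dev:+6.1f} cents")
```

On this measure the 11th partial falls almost exactly between two semitones and the 7th is noticeably flat, which is at least compatible with the avoidance of the former and the adjustment of the latter described in the text.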
Chöömij closely resembles borbannadyr, one of four Tuvin overtone singing styles described by A. N. Aksenov (3) that are largely characterized by the ranges in which they occur.
P. Crossley-Holland (4) describes two styles of overtone chanting cultivated by the Tibetan monasteries of Gyume and Gyumo that are differentiated from chöömij by their placement in a somewhat lower range.
We have so far provisionally established the nature and vocabulary of tones comprising the chöömij style, the physiological mechanics for their production, their relationship to general acoustical laws, and their general frequency range. Our attention now turns to the ictus. In the graph, the ictus is represented as a successive development and decay of overtones. For reasons to be discussed, it is considered as a progression toward “normal” sustained chöömij timbre. The graph of the 1.4-second-long period of attack reveals an upward-flowing glissando of overtone emphasis extending across a wide chöömij range, namely from the fundamental to the 10th partial. This dramatic upswing, accompanied by a smoother downward resolution of the 12th, 11th, 10th, and 9th partials into the 9th partial alone, is a composite of varied partial durations and intensities unfolding in time and resulting in an attack “shape.” We are dealing here with a complex of duration, intensity, overlapping, pitch, and grouping of partials. Aural perception is not one of an ascending glissando of individual overtone pitches, but rather of a gradual change of colour during the ictus, from whose complex sound emerges the pure, whistle-like b’’ sounding above the drone of G(5). Also, the 16th and 18th partials (1600 and 1800 Hz) appear at the end of the ictus and remain faintly present through to the end of the phrase. Our microanalysis, deliberately scrutinizing a 1.4-second-long detail, captures a delicate moment of vocal timbre which the singer of chöömij must effectively control in order to establish “normal” sustained chöömij sound. The ictus, representing a drive toward the sonal norm, isolated here for study, may well prove to be the key to a precise physiological explanation of this style(6).
Following our description of the ictus that precedes the unfolding of the melody, we now come to the “normal chöömij sound” as typified by the descending notes b”, a”, g”. The spectral configuration of the three descending whistle-tones shown by the melogram during the 2.1 seconds following the ictus is here considered typical and representative of chöömij sound; or, to speak more objectively, the distinctive “nasal” quality pervading this style results from the spectral configurations shown by the melogram and presented schematically in figure 3. These show the sounding areas of the formants in relation to nonsounding areas. Figure 3 shows three formant areas for chöömij: (1) the fundamental; (2) the melody area, 6th-13th partials; and (3) a higher nasal area that is new to our description for the range of this style. This third formant lies in the 1500-1600 Hz range in this excerpt, and is present as the 16th through 23rd partials in chöömij style generally. We have made the experiment of eliminating the third formant, and have found that this effectively negates the nasal quality so typical of this style(7). If the three formant areas in the arrangement presented by figure 3 are considered an accurate description of chöömij style, it suggests that a spectrum judged to be nasal has a nonsounding “hole” in the area of 900-1300 Hz. This further implies a more objective definition of our perception of nasality. In order to indicate the existence of a nonsounding hole, the range initially presented in figure 2 (1st through 13th partials) for the chöömij must be extended to include the area of the 16th through 23rd partials. They exist as a stable upper drone cluster of tones vital to maintaining the nasal character of the style, and their existence may be a function of physiological necessity. Our recognition of the “nasal formant” as an integral part of the style thus provides a further possible clue to its vocal production.
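The three formant areas and the elimination experiment can be mimicked on a toy line spectrum. This sketch is a loose digital analogue, not a reconstruction of the filtration apparatus actually used: the amplitude values are invented, the melody partial is fixed at the 8th (g” above a 100 Hz drone) for the example, and the function names are hypothetical.

```python
def stylized_spectrum(melody_partial=8):
    """Hypothetical relative amplitudes for partials 1..23."""
    amps = {n: 0.0 for n in range(1, 24)}
    amps[1] = 1.0                 # area 1: the drone fundamental
    amps[melody_partial] = 0.8    # area 2: one reinforced melody partial
    for n in range(16, 24):       # area 3: the upper "nasal" cluster
        amps[n] = 0.2
    return amps

def eliminate_band(amps, fundamental_hz, low_hz, high_hz):
    """Copy of amps with every partial inside [low_hz, high_hz] silenced."""
    return {n: (0.0 if low_hz <= n * fundamental_hz <= high_hz else a)
            for n, a in amps.items()}

def sounding_partials(amps):
    """Partial numbers with non-zero amplitude, in ascending order."""
    return [n for n, a in sorted(amps.items()) if a > 0.0]

spectrum = stylized_spectrum()
without_third = eliminate_band(spectrum, 100.0, 1600.0, 2300.0)
print("full spectrum:      ", sounding_partials(spectrum))
print("third area removed: ", sounding_partials(without_third))
```

Feeding the two amplitude sets to any additive synthesizer would give a rough way to audition the difference the text reports, under the stated assumptions.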
Attention to detail during the sustained tone production may give further insights into this problem. At the point where the melody descends from the b’’, the dovetailing of pairs of melodic overtones results in transitional areas where both can be heard simultaneously, resulting in the interval of a major second. Further, in our own experience, the last note g” predominates on first hearing; however, after an examination of the melogram where the a” is seen to be simultaneously present, the interval of a major 2nd can be heard quite distinctly.
Fig. 3. A stylized diagram of chöömij vocal sound. The dotted lines refer to the melogram shown in Plate 1 above.
We may have here an indication of the degree of efficiency of the mouth cavity as a selective overtone filter. It is clear that the effective filter width permits the passing of more than a single partial. The question then arises: Is a single melody note desired, with the human mechanism simply unable to produce it? Or, alternatively: Is it correct to end some phrases with a blend of two partials, such that the performer is in fact adhering to a canon of style?
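How much a neighbouring partial leaks through depends on the width of the resonance. The following sketch models the mouth cavity as a single second-order resonance and reports each partial's level relative to the level at the resonance centre. The model, the 100 Hz bandwidth and the 900 Hz centre are all assumptions made for illustration; the article does not give a measured filter width.

```python
import math

def relative_gain_db(freq_hz, center_hz, bandwidth_hz):
    """Gain of a second-order resonance at freq_hz, in dB relative to
    its gain at center_hz (Q = center_hz / bandwidth_hz)."""
    q = center_hz / bandwidth_hz
    r = freq_hz / center_hz
    magnitude = 1.0 / math.sqrt((1 - r * r) ** 2 + (r / q) ** 2)
    return 20 * math.log10(magnitude / q)

fundamental_hz = 100.0   # drone, as read from the melogram
center_hz = 900.0        # assumed: resonance tuned to the 9th partial
bandwidth_hz = 100.0     # assumed filter width

for n in range(7, 12):
    gain = relative_gain_db(n * fundamental_hz, center_hz, bandwidth_hz)
    print(f"partial {n:2d}: {gain:6.1f} dB")
```

With a width of this order the adjacent partials are attenuated by only a few decibels, so a second partial remaining audible during transitions is unsurprising under these assumptions.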
Further, the two pitches a” and g” are accompanied by a rhythmic accent of the fundamental pitch. This accentuated accompaniment to melody tones occurs throughout this style. It might reasonably be anticipated that such accentuation would find some reflection in the display. Our melogram, however, shows no significant change in overall dynamics, such as would be typical of a push of air from the diaphragm. On the contrary, we find this dynamic swelling of the fundamental pitch to correspond to a strengthening of the 2nd and 3rd partials and, to a lesser degree, of the 5th. In reference to the physiological factors considered above, we could now ask what process involved in shifts of melodic whistle-tones necessitates the emphasis of other partial groups. It must be considered further, however, whether this accenting is related to an unconscious physiological necessity of resetting the mouth cavity filter for emphasizing a different melody partial, or whether it might be a stylistic trait effected by an independent alteration of the mouth cavity consciously cultivated to accompany and punctuate pitch change. Or is it both? The answers to these questions necessarily await further research.
Finally, the overall dynamic graph peaks during the initial attack and remains unusually stable during the length of the phrase (8). The stability of this graph during notes of long duration suggests an ability on the part of the singer to supply constant air pressure to the vocal mechanism producing the fundamental pitch. This may be another consciously cultivated feature.
The latter part of this article emphasizes the relevance of melographic analysis to the physiological processes of voice production. It would be fascinating to go further and to add computer facilities. It might then be possible to calculate a progression of mouth and nasal cavity configuration corresponding with the normal vocal style (9). When this can be realized, it may well bring a new dimension into the objective study of musical styles.
* chöömij (Hans-Peter Vietze, Lehrbuch der Mongolischen Sprache [Leipzig: VEB Verlag Enzyklopädie, 1969], pp. 15-16) or khöömii (J. E. Bosson, Modern Mongolian [Bloomington: Indiana University, 1964], p. 11) are two possible transliterations for the Mongolian “хөөмий,” which in Khalkha dialect means pharynx; throat; windpipe (A. Luvsandendev, Mongol’sko-russkii slovar [Moscow: Gos. Izd-vo Inostrannych i Natsional’nych Slovarej, 1957], p. 553). In Classical Mongolian it is written KÖGEMEI, which means pharynx; throat (F. Lessing, Mongolian-English Dictionary [Berkeley: University of California Press, 1960], p. 479). Aksenov (1964) writes chöömij and Vargyas (1968) hö-mi.
1. The comparison of chöömij with the Jew’s harp was suggested by Lajos Vargyas, “Performing Styles in Mongolian Chant,” in Journal of the International Folk Music Council XX (1968), 70-72.
2. Professor D. Dinowski of the Ethnology Department, University of Warsaw, has kindly facilitated a study of this material.
3. “Die Stile des tuvinischen zweistimmigen Sologesanges,” in Sowjetische Volkslied- und Volksmusikforschung, Erich Stockmann, ed. (Berlin: Akademie Verlag, 1967), pp. 293-308.
4. Notes to the recording, “The Music of Tibet: The Tantric Rituals,” Disk AST-4005, New York, Anthology 1970. Musical analysis by Peter Crossley-Holland; acoustical analysis by Kenneth N. Stevens.
5. This was investigated through a synthesis of this same excerpt on a generator of sine-tones produced through a process using insulated light. This apparatus was constructed by Dr. K. Schügerl, Phonogrammarchiv, Vienna, in 1970.
6. This topic is under study by Dr. Frank, Laryngologisches Institut, Vienna.
7. This result is based on filtration experiments carried out with the help of Dr. R. Brandi, Phonogrammarchiv, Vienna, 1970.
8. It is the opinion of Mr. Michael Moore, based on the perusal of a large number of melographs, that the dynamic display shows little fluctuation when compared with other vocal styles.
9. Apparatus of this nature already exists and is being further refined and developed by Dr. P. Ladefoged in the Phonetics Laboratory at
Overtone singing, a technique of Asian origin, is a special type of voice production resulting in a very pronounced, high and separate tone that can be heard over a more or less constant drone. An acoustic analysis of the phenomenon is presented and the results are described in terms of the classical theory of speech production. The overtone sound may be interpreted as the result of an interaction of closely spaced formants. For the lower overtones, these may be the first and second formant, separated from the lower harmonics by a nasal pole-zero pair, as the result of a nasalized articulation shifting from /ɔ/ to /a/, or, as an alternative, the second formant alone, separated from the first formant by the nasal pole-zero pair, again as the result of a nasalized articulation around /ɔ/. For overtones with a frequency higher than 800 Hz, the overtone sound can be explained as a combination of the second and third formant as the result of a careful, retroflex, and rounded articulation from /ɔ/, via schwa /ə/ to /y/ and /i/ for the highest overtones. The results indicate a firm and relatively long closure of the glottis during overtone phonation. The corresponding short open duration of the glottis introduces a glottal formant that may enhance the amplitude of the intended overtone. Perception experiments showed that listeners categorized the overtone sounds differently from normally sung vowels, which possibly has its basis in an independent perception of the small bandwidth of the resonance underlying the overtone. Their verbal judgments were in agreement with the presented phonetic-acoustic explanation.
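The effect of bringing two formants close together can be illustrated with a cascade of two second-order resonances, a common simplification in formant-based speech models. This is a sketch under stated assumptions rather than the analysis used in the study: the formant frequencies, the 80 Hz bandwidth and the 1600 Hz target harmonic are invented for the example, and the function names are hypothetical.

```python
import math

def formant_gain(freq_hz, formant_hz, bandwidth_hz):
    """Magnitude response of one second-order formant resonance
    (unit gain at 0 Hz)."""
    numerator = formant_hz ** 2
    denominator = math.sqrt((formant_hz ** 2 - freq_hz ** 2) ** 2
                            + (bandwidth_hz * freq_hz) ** 2)
    return numerator / denominator

def cascade_gain_db(freq_hz, formants_hz, bandwidth_hz=80.0):
    """Gain in dB at freq_hz of several formant resonances in cascade."""
    gain = 1.0
    for formant_hz in formants_hz:
        gain *= formant_gain(freq_hz, formant_hz, bandwidth_hz)
    return 20 * math.log10(gain)

target_hz = 1600.0  # assumed harmonic to be emphasised
spread = cascade_gain_db(target_hz, (1200.0, 2400.0))  # formants far apart
close = cascade_gain_db(target_hz, (1500.0, 1700.0))   # formants converged
print(f"spread pair: {spread:.1f} dB, close pair: {close:.1f} dB")
```

In this toy model the converged pair boosts the harmonic lying between the two formants far more than the widely separated pair does, which is the qualitative behaviour the abstract attributes to closely spaced formants.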
Electroglottogram recordings during overtone singing
Gerrit Bloothooft, Guus de Krom, Susan Jansen, Allard van der Heijden
Research Institute for Language and Speech (OTS) Utrecht University Trans 10, 3512 JK Utrecht, The Netherlands
Overtone singing involves careful articulation, resulting in a closely spaced formant pair (F1/F2 or F2/F3) that enhances the amplitude of the intended overtone considerably. Apart from this articulatory explanation it is likely that the glottal sound source plays an important role during overtone singing, but this has never been explicitly investigated so far. To this end we have made electroglottograms (EGG) from an experienced singer from the Tuva Republic and from a Dutch teacher of overtone singing. Khargira and sygyt techniques and some variants were recorded, for the authentic singer during songs, and for the Dutch singer as systematic scales of overtones. Whereas normally sung vowels showed standard shapes of the EGG, the shape of the EGG deviated considerably during overtone singing, for both singers. Instead of a single full wave per period, the EGG showed modulations during a period with higher frequency components. We will present an analysis of these modulations in relation to the frequency of the amplified overtone. A comparison is made between the two singers and the different and comparable overtone singing techniques they recognized.