Alexander V. Kharuto

Tuva Throat Singing: Acoustical Analysis and Model of Sound Production

Moscow P. I. Tchaikovsky Conservatory

Bolshaya Nikitskaya str., 13, Moscow, Russian Federation

Tel: +7(495)2906092 (of.)



The phenomenon of Tuva throat singing (khoomei) is well known worldwide. In this style of singing, the voice of one performer is perceived as two or more voices. The mechanism of sound production is not yet fully understood. In khoomei, one can hear at least two ‘voices’: a lower (bass) one and a higher one, which sounds like a flute. The sonogram shows an equidistant system of overtones whose spacing is equal to the pitch of the lower ‘voice’.

According to the so-called ‘overtone theory’, the melody of the higher khoomei voice is formed by moving a high-frequency formant, and all the overtones that the performer ‘needs’ are produced by his Ferrein’s cords. Another theory, based on experimental data, holds that in ‘two-voice’ sounding the false (vestibular) vocal cords form a kind of whistle, which independently generates a high sound. That theory does not explain how the equidistant overtone system arises in this case.

In this paper, a new explanation of khoomei sound production is given, which is based on signal theory and agrees with the facts established in previous investigations. Our model includes a low-frequency oscillator that controls (keys) a second, high-frequency oscillator. The resulting spectrum contains only components at frequencies that are multiples of the frequency of the low-frequency oscillator. This mechanism also forms a high-frequency formant whose position can change.


Khoomei research lies at the intersection of the humanities and the natural sciences, involving physicists, acousticians, physiologists and physicians. One of the most intriguing questions today is the mechanism of sound production in throat singing. The singing of Tibetan lamas, which resembles Tuva throat singing, was first analyzed by the English acousticians Smith, Stevenson and Tomlinson in 1967. Similar results were published by A. Banin and V. Lozhkin in 1973 [1]. They described the formation of the khoomei ‘melody’ as the sustained sounding of a low-frequency component (the bourdon sound) together with a high-frequency formant that changes its position and ‘highlights’ the overtones needed by the performer. The overtones are produced by the Ferrein’s cords in the performer’s throat. This theory was later named the ‘overtone theory’.

Other specialists doubted that so large a number of overtones could be produced from a low-frequency tone alone. By comparison, in academic singing, for example in Shaliapin’s voice, the number of overtones may be as high as in khoomei, yet the two-voice effect does not occur.

To investigate this problem further, a group of phoniatrists and vocal specialists was formed in 1975, and experimental acoustical and physiological studies of khoomei singers were carried out. Photographs and X-ray images of the singing throat were made in 1976 [2]. The study showed that during ‘two-voice’ sounding the vestibular cords, which are positioned before the Ferrein’s cords, form a ‘whistle’. At this moment, the singer increases the expiration and generates a 2-4 kHz tone, which is ‘some octaves higher than the main lower tone’ [2]. The lower tone continues to sound at the same time. Relying on these facts, the authors of [2] rejected the ‘overtone theory’ [1].

Fig. 1 shows a characteristic khoomei sonogram (style ‘sygyt’). The horizontal axis is time and the vertical axis is frequency. On the right side of the sonogram, a graph of the instantaneous spectrum at the time t = 5 s is superimposed. (The spectral power increases from right to left.) At t = 3.75 s the performer finishes the ‘recitative’ part, which sounds like hoarse singing, and begins the ‘vocalize’ part with its ‘two-voice’ sound. (All spectral graphs were produced with the author’s program SPAX, developed especially for such investigations.)

A sonogram like the one shown in fig. 1 was published in [2], and on this basis the authors of [2] argued for the presence of ‘two independent mechanisms’ of sound production. In the author’s view, the ‘presence of two mechanisms’ cannot be proved in this way, because the sonograms show the traces of only one of them: the low-frequency source, which produces the first harmonic at the bourdon frequency and a series of higher harmonics at multiples of it. A second, ‘independent’ sound source would produce a spectral line or a system of harmonics of its own. If these harmonics coincide with the spectral components of the first source, this fact must be explained.


Fig. 1. Sonogram of khoomei sound (style ‘sygyt’)
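
The point about a second ‘independent’ source can be illustrated numerically. The following is a minimal sketch and not the author’s SPAX program: it adds a free-running tone to a harmonic bourdon source and measures the spectral line it leaves. The 2000 Hz value of the second source, the 1/k harmonic amplitudes, the sample rate and all function names are assumptions chosen for illustration.

```python
import cmath
import math

F_DRONE = 185.0    # bourdon frequency read from fig. 1, Hz
F_SECOND = 2000.0  # assumed frequency of a free-running second source, Hz
FS = 44400.0       # assumed sample rate: 240 samples per bourdon period
N_SAMPLES = 8880   # 0.2 s, so both sources fall on exact 5 Hz analysis bins


def amplitude_at(x, freq):
    """Amplitude of the sinusoidal component of x at freq (single DFT bin)."""
    acc = sum(v * cmath.exp(-2j * math.pi * freq * n / FS) for n, v in enumerate(x))
    return 2.0 * abs(acc) / len(x)


def two_independent_sources():
    """Harmonic bourdon source plus an unrelated tone (illustrative amplitudes)."""
    out = []
    for n in range(N_SAMPLES):
        t = n / FS
        drone = sum(math.sin(2.0 * math.pi * k * F_DRONE * t) / k for k in range(1, 13))
        out.append(drone + 0.5 * math.sin(2.0 * math.pi * F_SECOND * t))
    return out


x = two_independent_sources()
nearest_k = round(F_SECOND / F_DRONE)
offset = F_SECOND - nearest_k * F_DRONE
print(f"line at {F_SECOND:.0f} Hz, amplitude {amplitude_at(x, F_SECOND):.2f}")
print(f"distance from nearest bourdon harmonic: {offset:.0f} Hz")
```

Under these assumptions the second source leaves its own line between the bourdon harmonics, which is exactly what the khoomei sonogram does not show.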


Possible interactions of two sound sources were studied by specialists from Tuva [3]. In that article, the authors assert that the two voice oscillators are self-synchronized in some way: ‘In a system which contains two connected sound sources in the throat, which have neighboring oscillation frequencies, according to synchronization theory a stabilization effect of the base tone occurs’ ([3], p. 379). However, we find this conclusion insufficiently rigorous.

Based on new results obtained with computer sound analysis and khoomei sound simulation, the author offers another explanation of the mechanism of sound production in khoomei. This explanation rests on signal theory and on facts established in previous investigations, namely: 1) the presence of only one system of equidistant harmonics in khoomei sonograms, both those calculated in our work and those obtained by other authors; 2) the formation of a ‘whistle’ in the singer’s throat that generates a tone of 2-4 kHz (this has been simulated physically in [2]).

In the new model, the author proposes the following mechanism of ‘two-voice’ sound production. The Ferrein’s cords interrupt the expired air stream at a certain rhythm. The vestibular cords, which form the ‘whistle’, respond to every air pressure pulse with an aerodynamic whistle at frequencies of 2-4 kHz. As a result, a sequence of pulses is formed at the ‘output’ of the vocal tract with the period of oscillation of the Ferrein’s cords, and the pulses are ‘filled’ with high-frequency oscillations produced by the vestibular cords. The study of natural khoomei oscillations confirms this model. Fig. 2 shows a fragment of the oscillogram from the ‘vocalize’ part of the performance. (The time scale is enlarged in comparison with fig. 1.) These oscillations have the character of a sinusoid that is amplitude-modulated by pulses with a modulation coefficient close to 100%. The pulse duration (between two adjacent valleys) is equal to the period of the main tone, whose frequency measured from the spectrum is f1 = 185 Hz (see the lowest line of the sonogram in fig. 1).



Fig. 2. Oscillogram of khoomei sound during the ‘vocalize’ part (see fig. 1, time t = 4.5 s)
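
The waveform described above can be sketched in a few lines. This is only an illustration under stated assumptions: the raised-sine envelope, the 2000 Hz filling frequency, the restart of the filling at every pulse, the sample count and the function name `gated_pulse_train` are not taken from the measurements; only the 185 Hz pulse frequency comes from the text.

```python
import math


def gated_pulse_train(f_pulse, f_fill, n_per_period, n_periods):
    """Pulses repeating at f_pulse, each filled with a sinusoid at f_fill.

    The raised-sine envelope and the restart of the filling phase at every
    pulse are modelling assumptions, not measured data.
    """
    period = 1.0 / f_pulse
    samples = []
    envelope = []
    for n in range(n_per_period * n_periods):
        t_local = (n % n_per_period) / n_per_period * period  # time since pulse start
        env = math.sin(math.pi * t_local / period) ** 2       # 0 at valleys, 1 at peak
        envelope.append(env)
        samples.append(env * math.sin(2.0 * math.pi * f_fill * t_local))
    return samples, envelope


signal, envelope = gated_pulse_train(185.0, 2000.0, 240, 4)
depth = (max(envelope) - min(envelope)) / (max(envelope) + min(envelope))
print(f"modulation depth: {depth:.2f}")
print(f"pulse duration: {1000.0 / 185.0:.2f} ms")
```

The envelope falls to zero in every valley, which corresponds to a modulation coefficient of 100%, and the waveform repeats exactly from one pulse to the next.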


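In signal theory, the spectrum of such oscillations is constructed as follows. A sequence of pulses with repetition frequency f1 modulates the amplitude of oscillations with frequency f0. Each of the pulses has a spectrum G(f). Therefore, the resulting oscillation has the spectrum

$$S(f) = \frac{f_1}{2} \sum_{k=-\infty}^{\infty} G(k f_1)\,\bigl[\delta(f - f_0 - k f_1) + \delta(f + f_0 - k f_1)\bigr].$$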

The formation of such a spectrum is shown schematically in fig. 3.













Fig. 3. Spectrum of a single pulse (a), spectrum of a periodic pulse sequence (b), and spectrum of a sinusoidal oscillation modulated by the periodic pulse sequence (c)


The resulting spectrum is twice as wide as that of the pulse sequence, and its center is shifted to the frequency f0 of the high-frequency oscillation. In this way the ‘formant’ appears in the region of higher frequencies. It is also known [4] that if a low-frequency oscillator with pulse frequency f1 keys another oscillator, whose oscillations start at the beginning of each pulse and stop when the pulse ends, then the resulting spectrum contains only components at the frequencies k·f1, where k = 1, 2, 3, … is the harmonic number. The reason is that every pulse then has the same shape, so the signal repeats exactly with the period 1/f1. Therefore, this spectrum consists only of harmonics of the low-frequency source that controls the second oscillator. In this case, the component at the frequency f0 itself can be absent from the spectrum: such a situation can be seen in fig. 1 at t = 4.5 s, when the formant changes its position.
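
The statement that a keyed oscillation contains only multiples of the pulse frequency can be checked with a short numerical sketch. The pulse shape, the 2000 Hz filling frequency, the sample count and the function names below are assumptions made for illustration; the check evaluates single DFT bins at the harmonics and halfway between them.

```python
import cmath
import math

F_PULSE = 185.0  # pulse frequency of the low-frequency oscillator, Hz
F_FILL = 2000.0  # assumed 'whistle' frequency; not a multiple of F_PULSE
N = 240          # samples per pulse period (assumed)
PERIODS = 4      # even, so half-integer multiples of F_PULSE are exact DFT bins


def keyed_oscillation():
    """High-frequency oscillation restarted at the beginning of every pulse."""
    out = []
    for n in range(N * PERIODS):
        frac = (n % N) / N                        # position inside the pulse, 0..1
        envelope = math.sin(math.pi * frac) ** 2  # assumed pulse shape
        out.append(envelope * math.sin(2.0 * math.pi * (F_FILL / F_PULSE) * frac))
    return out


def amplitude(x, multiple):
    """Amplitude of the component at multiple * F_PULSE."""
    acc = sum(v * cmath.exp(-2j * math.pi * multiple * n / N) for n, v in enumerate(x))
    return 2.0 * abs(acc) / len(x)


x = keyed_oscillation()
on_grid = {k: amplitude(x, k) for k in range(1, 21)}
between = {k + 0.5: amplitude(x, k + 0.5) for k in range(1, 20)}
k_peak = max(on_grid, key=on_grid.get)
print(f"largest component between harmonics: {max(between.values()):.1e}")
print(f"strongest harmonic: k = {k_peak}, {k_peak * F_PULSE:.0f} Hz")
```

In this sketch the components between the harmonics vanish to rounding error, while the strongest harmonic is the multiple of 185 Hz nearest to the filling frequency, not the filling frequency itself.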

The result of a computer simulation of a pulse sequence ‘filled’ with high-frequency oscillations is shown as a sonogram in fig. 4. The ‘envelope’ of the resulting oscillations is similar to that in fig. 2. The pulse frequency is 165 Hz; the frequency of the ‘second oscillator’ (simulating the vestibular cords) is first 2000 Hz, then 2150 Hz, and finally 2000 Hz again. The most powerful harmonics group around the frequency of the ‘second oscillator’ and form a spectral ‘formant’, whose position changes with the frequency of the ‘second oscillator’. Movement of this formant ‘selects’ different overtones of the spectrum, but the overtones themselves do not move: their frequencies remain k·f1, the multiples of the pulse frequency. This simulation corresponds to the real khoomei sonogram shown in fig. 1.





Fig. 4. Sonogram of a computer-simulated pulse sequence with constant repetition frequency (165 Hz) and changing high-frequency ‘filling’ oscillations
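
A reduced version of such a simulation can be sketched as follows. It is not the program used for fig. 4: the raised-sine envelope, the sample count and the function names are assumptions, and only the 165 Hz pulse frequency and the 2000 Hz, 2150 Hz, 2000 Hz steps are taken from the text. For each step the sketch finds the strongest harmonic of the pulse frequency.

```python
import cmath
import math

F_PULSE = 165.0  # pulse frequency used in the simulation of fig. 4, Hz
N = 256          # samples per pulse period (assumed)


def one_pulse(f_fill):
    """One period of the keyed oscillation; the raised-sine envelope is assumed."""
    return [
        math.sin(math.pi * n / N) ** 2
        * math.sin(2.0 * math.pi * (f_fill / F_PULSE) * n / N)
        for n in range(N)
    ]


def harmonic_amplitudes(pulse, k_max=30):
    """Amplitudes of harmonics k * F_PULSE of the periodic repetition of pulse."""
    amps = {}
    for k in range(1, k_max + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * k * n / N) for n, v in enumerate(pulse))
        amps[k] = 2.0 * abs(acc) / N
    return amps


peaks = []
for f_fill in (2000.0, 2150.0, 2000.0):  # 'second oscillator' steps from the text
    amps = harmonic_amplitudes(one_pulse(f_fill))
    k_peak = max(amps, key=amps.get)
    peaks.append(k_peak)
    print(f"fill {f_fill:.0f} Hz -> strongest harmonic k = {k_peak} ({k_peak * F_PULSE:.0f} Hz)")
```

Under these assumptions the formant peak moves from the 12th harmonic (1980 Hz) to the 13th (2145 Hz) and back, while the harmonic grid itself stays fixed at multiples of 165 Hz.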


The sonogram in fig. 1 also shows some low-frequency harmonics, which are not present in the spectrum in fig. 3c. This effect can possibly be explained by incomplete blocking of the air stream by the vestibular cords, which form the ‘whistle’. In this case, part of the air pulses formed by the Ferrein’s cords passes directly into the oropharyngeal horn, and the spectrum then contains components both from the ‘usual’ voice pulses and from the high-frequency pulses formed by the vestibular cords. As shown above, no dissonance appears, because the harmonics of both sources are multiples of the lower frequency f1.
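
The effect of incomplete blocking can be illustrated in the same way. In the following sketch a fraction of ordinary voice pulses is added to the keyed whistle; the leak fraction of 0.3, both pulse shapes, the 2000 Hz filling frequency and the function names are hypothetical values chosen only to show the tendency.

```python
import cmath
import math

F_PULSE = 185.0  # bourdon frequency, Hz
F_FILL = 2000.0  # assumed 'whistle' frequency, Hz
N = 240          # samples per pulse period (assumed)
PERIODS = 4      # even, so half-integer multiples of F_PULSE are exact DFT bins


def vocalize(leak):
    """Keyed whistle plus a fraction `leak` of ordinary voice pulses."""
    out = []
    for n in range(N * PERIODS):
        frac = (n % N) / N
        envelope = math.sin(math.pi * frac) ** 2
        whistle = envelope * math.sin(2.0 * math.pi * (F_FILL / F_PULSE) * frac)
        voice = envelope ** 4  # assumed shape of the air pulse that leaks through
        out.append(whistle + leak * voice)
    return out


def amplitude(x, multiple):
    """Amplitude of the component at multiple * F_PULSE."""
    acc = sum(v * cmath.exp(-2j * math.pi * multiple * n / N) for n, v in enumerate(x))
    return 2.0 * abs(acc) / len(x)


closed = vocalize(0.0)  # complete blocking: whistle pulses only
leaky = vocalize(0.3)   # hypothetical incomplete blocking
for k in (1, 2, 3):
    print(f"k = {k}: closed {amplitude(closed, k):.4f}, leaky {amplitude(leaky, k):.4f}")
off_grid = max(amplitude(leaky, k + 0.5) for k in range(1, 20))
print(f"largest component between harmonics: {off_grid:.1e}")
```

With the leak, the first harmonics rise well above their level in the fully blocked case, and no component appears between the multiples of the pulse frequency.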

Consequently, our model of the interaction of two oscillating systems in the vocal tract during the ‘vocalize’ part of a Tuva throat singing performance explains the main effects that can be heard and measured in the spectral study of this sound.



  1. Banin A.A., Lozhkin V.N. Ob akusticheskih osobennostyah tuvinskogo sol’nogo dvuhgolosiya (On the acoustical peculiarities of Tuva solo two-voice singing) // VIII Russian national acoustical conference. Moscow, 1973 (in Russian).
  2. Dmitriev L.B., Tchernov B.P., Maslov V.T. Taina tuvinskogo «dueta» ili svoistvo gortani cheloveka formirovat’ mehanizm aerodinamicheskogo svista (The secret of the Tuva ‘duet’, or the property of the human throat to form an aerodynamic whistle). Novosibirsk, 1992 (in Russian).
  3. Ondar M.A.Kh., Saryglar A.S. O fizicheskoi prirode zvukov tuvinskogo gorlovogo peniya (On the physical nature of the sounds of Tuva throat singing) // In: Problems of study of cultural history of peoples in Central Asia and adjacent regions: Proceedings of International scientific and practical conference in Kysyl, 5-8 September, 2005. Kysyl, 2006, p. 380-381 (in Russian).
  4. Gonorovsky I.S. Radiotehnicheskie tsepi i signaly (Radio engineering: circuits and signals). Part 2. Moscow: Sowetskoe radio, 1967 (in Russian).