Fundamental Frequency

The fundamental frequency refers to the lowest frequency of a periodic waveform, such as a sound wave produced by the human voice. In linguistics, it is associated with the pitch of a spoken utterance and is crucial for conveying intonation and meaning in language. The fundamental frequency is measured in hertz and plays a significant role in phonetics and speech analysis.

Written by Perlego with AI-assistance

4 Key excerpts on "Fundamental Frequency"

  • Speech Acoustic Analysis
    • Philippe Martin (Author)
    • 2020 (Publication Date)
    • Wiley-ISTE (Publisher)
    The laryngeal frequency can vary considerably during phonation and can extend over several octaves. In extreme cases, transitions from 100 Hz to 300 Hz (change from normal phonation to falsetto mode) can be observed during an interval of two or three cycles.
    Furthermore, successive cycles can show variations of several percent around a mean value, depending, among other things, on the physiological state of the muscles involved in the vibration mechanism. Even direct observation (for example, by rapid cinematography) does not always allow precise identification of the cycle beginnings, due to creaky voice, breath, etc. This may result in errors that are difficult to minimize.
    The name “Fundamental Frequency”, given to the acoustic measurement of laryngeal vibration, derives from its similarity to the term used for the base frequency in a Fourier analysis, in which the harmonics are integer multiples of the fundamental. This can sometimes result in confusion, which the context is not always sufficient to resolve.
    F0 can be measured from the speech signal in the time domain, for example after signal filtering, or in the frequency domain, from the Fundamental Frequency (in the Fourier sense) of a voiced sound. The successive F0 values are plotted over time to give a so-called pitch curve, produced during phonation (Figure 7.6). This pitch curve conventionally displays null values at segments of unvoiced speech or silence.
    Figure 7.6.
    Example of a pitch curve displayed as a function of time and varying from approximately 95 Hz to 260 Hz
    The difficulty in measuring the Fundamental Frequency is largely due to the fact that, strictly speaking, there are no glottic vibration cycles, but rather the recurrence of a movement that is controlled by numerous parameters (adductor and tension muscles controlling the vocal folds, pressure under the glottis, etc.). The speech signal from which the measurement is made is the result of the complex interaction of glottic stimulation and temporal variations in the shape of the vocal tract.
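To make the time-domain approach concrete, here is a minimal sketch of one common technique, autocorrelation peak picking, which is not necessarily the method used in the excerpt. The function names, the 75–400 Hz search range, and the 0.5 voicing threshold are illustrative assumptions. Frames judged unvoiced or silent are reported as 0, following the null-value convention for pitch curves described above.

```python
import math

def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=400.0, voicing_threshold=0.5):
    """Estimate F0 (Hz) of one frame by autocorrelation; return 0.0 if unvoiced.

    The search range and voicing threshold are illustrative assumptions.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]           # remove any DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return 0.0                          # silence
    lag_min = int(sample_rate / f0_max)     # shortest period considered
    lag_max = min(int(sample_rate / f0_min), n - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        corr = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    if best_corr < voicing_threshold:
        return 0.0                          # no clear periodicity: treat as unvoiced
    return sample_rate / best_lag

def pitch_curve(signal, sample_rate, frame_len, hop):
    """Frame-by-frame F0 values; 0.0 marks unvoiced or silent frames."""
    return [estimate_f0(signal[start:start + frame_len], sample_rate)
            for start in range(0, len(signal) - frame_len + 1, hop)]

# Synthetic test signal: 80 ms of a 125 Hz "voiced" tone with two harmonics,
# followed by 40 ms of silence, sampled at 8 kHz.
sample_rate = 8000
voiced = [math.sin(2 * math.pi * 125 * t / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 250 * t / sample_rate)
          + 0.25 * math.sin(2 * math.pi * 375 * t / sample_rate)
          for t in range(640)]
signal = voiced + [0.0] * 320
print(pitch_curve(signal, sample_rate, frame_len=320, hop=320))
```

Real speech is much harder than this synthetic case: cycle-to-cycle variation, creaky voice, and vocal tract resonances can all shift or blur the autocorrelation peak, which is why practical pitch trackers add filtering and post-processing.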
  • Speech Production and Language
    In Honor of Osamu Fujimura
    eBook - PDF
    • Shigeru Kiritani, Hajime Hirose, Hiroya Fujisaki (Authors)
    • 2013 (Publication Date)
    Fundamental Frequency rule for English discourse
    Noriko Umeda
    1. Introduction
    The paper describes a Fundamental Frequency (F0) rule for readings of unlimited texts as a part of a rule-synthesis program running in our laboratory. The rule produces smooth and natural sounding contours for unlimited English speech material. Unlike F0 rules developed by other researchers, which were derived from a number of short unrelated sentences, the rule described here was derived from the analysis of F0 contours of continuous essay readings by various talkers. Analyses were done in terms of discourse structure (Umeda 1982a), boundary situations (Umeda–Quinn 1980; Umeda 1982a; Umeda 1982b), syllabic and stress conditions of the word (Umeda 1981a; Umeda 1982b), some semantic considerations, such as frequency of occurrence of the word and the difference between new and old ideas (Coker–Umeda 1971; Umeda 1982a; Umeda 1982b), and the intrinsic nature of F0 according to the identity of phonemes in fluent speech (Umeda 1981a). The rule tries to incorporate all the consistencies found in the data analyses and produces quite natural F0 contours for unlimited speech material of an essay-reading style. The primary goal of the rule is to produce good Fundamental Frequency contours for such large material without noticeable machine accent, and not to prove a certain physiological or theoretical model of glottal behaviors. The performance of the rule has been auditorily tested using an LPC (Linear-Predictive Coding) vocoder (Atal–Hanauer 1971). The test has demonstrated that the rule produces F0 contours as natural as the F0 produced by a human talker. Each language has its own surface structure of rhythm and intonation to realize phonological parameters such as interphrasal relationships, syllable structure and constraints of words, stress or accent, and phoneme sequential constraints.
  • Speech Enhancement
    Theory and Practice, Second Edition
    eBook - ePub
    • Philipos C. Loizou (Author)
    • 2013 (Publication Date)
    • CRC Press (Publisher)
    Chapter 3, the opening and closing of the vocal folds during voicing produces periodic waveforms (voiced segments of speech). The time duration of one cycle of the vocal folds’ opening or closing is known as the fundamental period, and the reciprocal of the fundamental period is known as the vocal pitch or Fundamental Frequency (F0). The Fundamental Frequency F0 varies from a low frequency of around 80 Hz for male speakers to a high frequency of 280 Hz for children [29]. The presence or absence of periodicity signifies the distinction between voiced and unvoiced sounds (e.g., between [d] and [t]). The F0 periodicity is also responsible for the perception of vocal pitch, intonation, prosody, and perception of lexical tone in tonal languages.
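As a quick worked example of the reciprocal relation between fundamental period and F0, the sketch below converts the two F0 values quoted above into periods. The helper names are hypothetical and chosen only for illustration.

```python
def period_from_f0(f0_hz):
    """Fundamental period in seconds for a given F0 in hertz."""
    return 1.0 / f0_hz

def f0_from_period(period_s):
    """F0 in hertz for a given fundamental period in seconds."""
    return 1.0 / period_s

# The two ends of the range quoted in the excerpt.
for label, f0 in [("male speaker (low end)", 80.0), ("child (high end)", 280.0)]:
    print(f"{label}: F0 = {f0:.0f} Hz, period = {period_from_f0(f0) * 1000:.2f} ms")
```

An 80 Hz voice therefore has a glottal cycle of 12.5 ms, while a 280 Hz voice completes a cycle in roughly 3.6 ms.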
    FIGURE 4.10 Example FFT-magnitude spectra of the vowel /eh/ corrupted by multitalker babble at 0 and 5 dB S/N. The noisy-magnitude spectra were shifted down for better clarity.
    The voiced segments of speech (e.g., vowels) are quasi-periodic in the time domain and harmonic in the frequency domain. The periodicity of speech is broadly distributed across frequency and time and is robust in the presence of noise. Figure 4.10 shows the FFT spectra of the vowel /eh/ in quiet and in noise. It is clear that the lower harmonics are preserved in noise, at least up to 1 kHz. This suggests that listeners have access to relatively accurate F0 information in noise. Such information, as we will discuss later (Section 4.3.3), is important for understanding speech in situations where two or more people are speaking simultaneously [32].
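The statement that voiced speech is harmonic in the frequency domain can be illustrated with a small sketch. It builds a synthetic periodic signal (an assumption standing in for a real vowel), computes a plain discrete Fourier transform, and lists the frequencies that carry energy. The signal parameters, the peak threshold, and the function name are illustrative choices only.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitudes of the DFT bins from 0 Hz up to (not including) Nyquist."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]

sample_rate = 8000   # Hz
n = 320              # 40 ms frame, so bins are spaced 25 Hz apart
f0 = 125.0           # Hz, assumed fundamental of the synthetic "vowel"

# Periodic signal: fundamental plus two weaker harmonics.
x = [math.sin(2 * math.pi * f0 * t / sample_rate)
     + 0.5 * math.sin(2 * math.pi * 2 * f0 * t / sample_rate)
     + 0.25 * math.sin(2 * math.pi * 3 * f0 * t / sample_rate)
     for t in range(n)]

magnitudes = dft_magnitudes(x)
# Keep only bins with clearly non-negligible energy (threshold is arbitrary).
peak_freqs = [k * sample_rate / n for k, m in enumerate(magnitudes) if m > 1.0]
print(peak_freqs)
```

The only bins with energy sit at integer multiples of F0. In real speech the harmonics are broadened by quasi-periodicity and shaped by the vocal tract, but the same regular spacing is what makes F0 recoverable from the lower harmonics in noise.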

    4.2.4 RAPID SPECTRAL CHANGES SIGNALING CONSONANTS

    Unlike vowels, consonants are short in duration and have low intensity. Consequently, they are more vulnerable to noise or distortion, compared to vowels. The duration of the /b/ burst, for instance, can be as brief as 5–10 ms. Vowels on the other hand can last as long as 300 ms [33].
    Formant transitions associated with vowel or diphthong production are slow and gradual. In contrast, consonants (particularly stop consonants) are associated with rapid spectral changes and rapid formant transitions. In noise, these rapid spectral changes are preserved to some degree and serve as landmarks signaling the presence of consonants. Figure 4.11 shows example spectrograms of a sentence embedded in +5 dB babble noise. Note that the low-frequency (and intense) vowels alternate frequently with the high-frequency (and weak) consonants, resulting in sudden spectral changes. For instance, the low-frequency spectral dominance at near 500, 1000, and 1700 ms is followed by a sudden dispersion of spectral energy across the spectrum. These sudden spectral changes coincide with the onsets of consonants. Although for the most part the high-frequency information is smeared and heavily masked by the noise, the onsets of most of the consonants are preserved (see arrows in Figure 4.11
  • Early Language Acquisition of Mandarin-Speaking Children
    • Yunqiu Zhang (Author)
    • 2019 (Publication Date)
    • Routledge (Publisher)
    The latest version was changed to the name SpeechLab. The present study used version v0.5.
    7 The original codes are rather complex and are omitted here.
    8 According to analysis on the measurement data in this study by Meng (2006a), 2,000 ± 300 Hz is not an effective reaction region for Mandarin vowels. Only three non-major vowels, such as ü, have a second formant F2 near 2,000 Hz. Further research is needed to investigate the constraining relationship between the phenomenon of a weak reaction region and its effect on vowel production from a physiological perspective and vowel perception from an acoustic perspective.
    9 This present research does not consider the problem of vocalization type.
    10 Also known as range. The relative range of variation in the present research refers to the relative differences between the extreme values.
    11 Fundamental Frequency data in this figure were drawn from Table 1. High bounds represent the means of female F0, lower bounds represent the means of male F0, the middle points are means averaged across genders. The p-values at the top are from t-tests, with the original F0 data and N = 4.
    12 Due to the atrophy and thinning of the oral mucosa and the enlargement of the volume of the vocal tract, the vocal resonance and laryngeal airway resistance also change. These may be the causes of such a phenomenon. See Han et al. (2007). Sangyin Yixue [嗓音医学] (p. 49). Beijing: People Health Publishing House, for details.
    13 Kenneth N. Stevens (1998). Acoustic phonetics (p. 12). Cambridge, MA; London, England: MIT Press. In addition, there is no research data regarding children’s vocal tract and vocal cords; therefore we cannot discuss the quantified relationship between fundamental frequency and physiological changes during the period of voice change and can only discuss such a relationship in a qualitative manner.
    14 Underage is marked as u while adult is marked with a. Same below.