VOWEL NORMALIZATION IN SOCIOPHONETICS: WHEN, WHY, HOW?

Anne Fabricius
Sociolinguistics Circle, Copenhagen University, September 16, 2008

Outline of today’s talk
Part 1
• Vowel Normalization: what is it?
• When and why normalize formant data?
• Typology of Normalization methods
• Normalising using the NORM website
Part 2
• Comparisons between the S-procedure and others
• Data
• Methods; Three test comparisons
• Results
• Discussion and Conclusions
(This part was recently presented at ASA08, Paris with Dom Watt and Dan Johnson)

Vowel normalization: what is it?
• A process by which acoustic data (vowel formants) are made comparable across groups of individuals
  - by eliminating the differences between individuals’ acoustic output that arise because we all have different vocal tracts
• Men, for instance, have on average larger vocal tracts than women, so the resonances they produce will be lower than those of most women

So when and why do you need to normalize?
• To eliminate acoustic variation in vowel measurements due to physiological differences among speakers (differences in head/vocal tract sizes).
• At the same time, you want to preserve sociolinguistic/dialectal/cross-linguistic differences in vowel quality, in order to find genuine variation (and perhaps evidence of change in progress)
• (To model the cognitive processes that allow human listeners to normalize vowels uttered by different speakers): this is more important for phoneticians than for sociophoneticians

Typology of normalization methods
• Intrinsic versus extrinsic methods (describes the range of information used: one only versus all in a system)
• Parameters of (speaker), vowel, formant
• Methods in all combinations
• Some assume ‘scaling’, some don’t...
• Adank (2003) used a large sample of standard Dutch vowels from 8 regions in the Netherlands to test 12 normalization algorithms

Typology of normalization methods
• Her most successful methods, in terms of performance in linear discriminant analyses (eliminating different types of ‘noise’ in the data and preserving sociolinguistic information), were speaker intrinsic, vowel extrinsic, formant intrinsic:
  - Lobanov Z-score
  - Nearey individual log-mean (CLIHi4 in Adank 2003)
• And (we could add) the S-procedure (Watt and Fabricius 2002)
• (more in a minute)...

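The two named methods can be sketched in a few lines. This is a minimal illustration, not the NORM implementation: the function names and the F1 values are hypothetical, and the use of the sample standard deviation in the Lobanov z-score is an assumption. Both functions operate on one speaker's measurements of one formant, which is what makes them speaker intrinsic, vowel extrinsic and formant intrinsic.

```python
import math
from statistics import mean, stdev


def lobanov(values):
    """Z-score one speaker's measurements of a single formant (Hz).

    Assumption: the sample standard deviation is used.
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def nearey_log_mean(values):
    """Subtract the speaker's mean log value for this formant from each log measurement."""
    logs = [math.log(v) for v in values]
    m = mean(logs)
    return [x - m for x in logs]


# Hypothetical F1 means (Hz) for four vowels of one speaker
f1 = [300.0, 450.0, 600.0, 750.0]
z = lobanov(f1)
n = nearey_log_mean(f1)
print([round(v, 3) for v in z])
print([round(v, 3) for v in n])
```

Multiplying every input value by a constant leaves the Lobanov output unchanged, which is the sense in which a uniform vocal-tract scaling is removed.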
First: NORMalizing made simple
The NORM form
And the NORM output...
To find out more about NORM
• Google ‘NORM SUITE’
• Uses open-source programming in R
• Presents 5 major methods, with variants under some models
• Simple structured text files as input (model shown)
• Explains methodological drawbacks and advantages for each (Methods page)
• NB: Plotting in NORM is mediated by another programme, whose effects can obscure what you want to see; other options, using the numerical results, are Excel or Plotnik (Mac only)

Moreover....
• Tyler Kendall happily fields questions about NORM!
• The system is easy to use once you get the hang of it but can be tricky to start off with....
• Like Praat....

Part 2: Introducing the S-procedure
• Watt & Fabricius (2002) published a description of a vowel formant frequency normalization procedure based on DW’s work with British English vowel variation
• Based on estimates of F1 and F2 maxima and minima taken for each speaker in the sample
• Calculates a centroid S (after Koopmans-van Beinum 1980) derived from 3 corner points defined by these estimates (cf. Bigham 2008, using 4)
• All individual formant measurements are then expressed relative to S
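A minimal sketch of the centroid idea, under stated assumptions: the corner points are built as i = (min F1, max F2), u' = (min F1, min F1) and a = (max F1, F2 of the vowel with the highest F1), and "relative to S" is read as division by the centroid value for each formant. The function name, the choice of F2 for the open corner and the vowel means are hypothetical illustrations, not the NORM code.

```python
def s_centroid_normalize(vowel_means):
    """Normalize one speaker's vowel means against a three-corner centroid S.

    vowel_means: dict mapping vowel label -> (F1, F2) in Hz.
    Returns (normalized dict, (S_F1, S_F2)).
    """
    f1_min = min(f1 for f1, _ in vowel_means.values())
    f1_max = max(f1 for f1, _ in vowel_means.values())
    f2_max = max(f2 for _, f2 in vowel_means.values())
    # Assumption: the open corner takes its F2 from the vowel with the highest F1
    a_f2 = max(vowel_means.values(), key=lambda p: p[0])[1]

    corner_i = (f1_min, f2_max)   # close front corner
    corner_u = (f1_min, f1_min)   # constructed close back corner, F2 set equal to min F1
    corner_a = (f1_max, a_f2)     # open corner

    s_f1 = (corner_i[0] + corner_u[0] + corner_a[0]) / 3
    s_f2 = (corner_i[1] + corner_u[1] + corner_a[1]) / 3
    normalized = {v: (f1 / s_f1, f2 / s_f2) for v, (f1, f2) in vowel_means.items()}
    return normalized, (s_f1, s_f2)


# Hypothetical vowel means (Hz) for one speaker
speaker = {"FLEECE": (300, 2400), "TRAP": (900, 1500), "GOOSE": (350, 1200)}
norm, s = s_centroid_normalize(speaker)
print(s)
print(norm["FLEECE"])
```

Because every value is divided by a speaker-specific S, the centroid of each speaker's constructed triangle lands at (1, 1) in the normalized space.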
[Figure: mW&F (F2 only). Vowel triangle in the F2 (Hz) by F1 (Hz) plane with centroid S and corner points i (min F1, max F2), u' (min F2 (= min F1)) and a (max F1)]

Goals and Research Questions, Part 2
• Road-test the S-procedure against other vowel-extrinsic/formant-intrinsic methods
• Develop and refine some comparison procedures focussed on the visual comparison criterion relevant to sociophonetics
• Q: How well does S-centroid (W&F) perform, compared to Lobanov and Nearey, at
  - Equalizing vowel space areas for multiple speakers
  - Improving intersection of vowel polygons
  - Preserving spatial relationships (juxtapositions) between vowel means compared to raw Hertz

Data
• RP data, 20 speakers from two independent sources
  - Male speakers (Hawkins and Midgley 2005): 5 in the oldest group, born 1928-1936; 5 in the youngest group, born 1976-1981
  - Female speakers: matching age groups, first 5 speakers in each group from Moreiras 2006 (dissertation, UCL)
• Aberdeen data
  - 6 speakers (3 male, 3 female) born between 1945 and 1986 (Watt & Yurkova 2007)

Methods
• Normalization using the NORM suite: http://ncslaap.lib.ncsu.edu/tools/norm/norm.php
• Watt and Fabricius, Lobanov (speaker-intrinsic) and Nearey (speaker-intrinsic) routines without ‘scaling factor’
• Alteration to W&F, here mW&F, coded by DJ
• Areas of individual vowel spaces calculated using R package gpclib (*)
(*) http://www.cs.man.ac.uk/~toby/alan/software/

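The talk computed vowel space areas with the R package gpclib. As a language-neutral illustration of the same quantity, the sketch below uses the shoelace formula on vowel means listed in order around the polygon. It is not the gpclib code, and the function name and coordinates are hypothetical.

```python
def polygon_area(points):
    """Shoelace formula: area of a simple polygon whose vertices are given in order."""
    total = 0.0
    n = len(points)
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical vowel means as (F2, F1) pairs in Hz, listed around the perimeter
vowel_space = [(2400, 300), (1500, 900), (1200, 350)]
area_hz2 = polygon_area(vowel_space)
print(area_hz2)
```

The area is in squared units of whatever scale the formants are on, so raw Hertz areas and normalized areas are only comparable through scale-free measures.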
Three Test Procedures
• Test One evaluated reduction of variation among vowel space areas.
  - Used comparisons of Squared Coefficients of Variation (SCV) to derive each method’s proportional reduction of variation relative to the Hertz SCV
  - Pitman-Morgan’s test of homogeneity of variance between correlated samples, which tests whether the dispersions become significantly smaller across normalization methods

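The two Test One quantities can be sketched as follows. The reading of "proportional reduction" as 1 minus the ratio of normalized SCV to Hertz SCV is an assumption, as are the function names and the area values. The Pitman-Morgan statistic is computed in its standard form, from the correlation between pairwise sums and differences, and is referred to a t distribution with n - 2 degrees of freedom; how the areas were rescaled before testing in the talk is not stated, so the inputs here are purely illustrative.

```python
import math
from statistics import mean, variance


def scv(areas):
    """Squared coefficient of variation: variance divided by the squared mean."""
    return variance(areas) / mean(areas) ** 2


def proportional_reduction(raw_areas, normalized_areas):
    """Assumed definition: 1 - SCV(normalized) / SCV(raw Hertz)."""
    return 1.0 - scv(normalized_areas) / scv(raw_areas)


def pitman_morgan_t(x, y):
    """t statistic (n - 2 df) for equality of variances in paired samples.

    Uses the fact that cov(x + y, x - y) = var(x) - var(y), so the correlation
    between sums and differences is zero when the two variances are equal.
    """
    sums = [a + b for a, b in zip(x, y)]
    diffs = [a - b for a, b in zip(x, y)]
    ms, md = mean(sums), mean(diffs)
    cov = sum((s - ms) * (d - md) for s, d in zip(sums, diffs))
    ss_s = sum((s - ms) ** 2 for s in sums)
    ss_d = sum((d - md) ** 2 for d in diffs)
    r = cov / math.sqrt(ss_s * ss_d)
    n = len(x)
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical vowel space areas for three speakers, before and after normalization
raw = [100.0, 150.0, 200.0]
normalized = [1.0, 1.1, 1.2]
print(round(proportional_reduction(raw, normalized), 3))
```

A positive t from `pitman_morgan_t(x, y)` indicates that `x` is the more dispersed of the two paired samples; the p-value would come from a t table or a statistics library.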
Three Test Procedures
• Test Two evaluated improvement in vowel polygon co-extensiveness
  - Co-extensiveness defined as the intersection of two polygons divided by the union of the same polygons
  - Paired t-tests comparing across methods

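The co-extensiveness ratio defined above can be illustrated with a small pure-Python sketch. It assumes both vowel polygons are convex, so that Sutherland-Hodgman clipping gives the intersection; real vowel polygons may not be convex, in which case a general polygon clipper (such as the gpclib package used in the talk) would be needed. All function names and coordinates are hypothetical.

```python
def signed_area(points):
    """Shoelace sum; positive when the vertices run counter-clockwise."""
    n = len(points)
    return sum(points[k][0] * points[(k + 1) % n][1]
               - points[(k + 1) % n][0] * points[k][1] for k in range(n)) / 2.0


def ccw(points):
    """Return the vertices in counter-clockwise order."""
    return list(points) if signed_area(points) > 0 else list(points)[::-1]


def clip_convex(subject, clip):
    """Sutherland-Hodgman clipping of `subject` by the convex polygon `clip` (both CCW)."""
    def inside(p, a, b):
        # True when p lies on or to the left of the directed edge a -> b
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Point where the line through p, q crosses the line through a, b
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for k in range(len(clip)):
        a, b = clip[k], clip[(k + 1) % len(clip)]
        current, output = output, []
        if not current:
            break  # nothing left to clip: the polygons do not overlap
        prev = current[-1]
        for cur in current:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
            prev = cur
    return output


def coextensiveness(poly_a, poly_b):
    """Intersection area divided by union area of two convex polygons."""
    a, b = ccw(poly_a), ccw(poly_b)
    inter = abs(signed_area(clip_convex(a, b)))
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union


# Two hypothetical overlapping vowel spaces in normalized units
space_1 = [(0, 0), (2, 0), (2, 2), (0, 2)]
space_2 = [(1, 0), (3, 0), (3, 2), (1, 2)]
print(round(coextensiveness(space_1, space_2), 3))
```

The ratio is 1 for identical polygons and 0 for disjoint ones, which is why higher averages in Test Two indicate better alignment of speakers' vowel spaces.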
Three Test Procedures
• Test Three observed intra-speaker vowel juxtapositions across normalization methods
  - Vowel space perimeter angles
  - RP DRESS-LOT juxtaposition (relatively stable diachronically)
  - RP TRAP-STRUT and LOT-FOOT juxtapositions (known to be changing over time, Fabricius 2007a and b)
  - (mW&F not tested here; RP data only)
• To see how the various methods affect angles compared to raw Hertz data (exploratory)

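A rough sketch of the two kinds of angle, under assumptions: vowel means are treated as (F2, F1) points, a perimeter angle is the directed angle at the middle vowel of a triple such as GOOSE-FLEECE-KIT, and a juxtaposition angle is the angle of the line joining two vowel means measured against the F2 axis. The exact conventions used in the talk (direction of rotation, reference axis) are not stated, so the function names and conventions here are hypothetical.

```python
import math


def perimeter_angle(a, b, c):
    """Directed angle in degrees (0 to 360) at vertex b, turning from b->a to b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return math.degrees(math.atan2(cross, dot)) % 360


def juxtaposition_angle(v_from, v_to):
    """Signed angle in degrees of the line v_from -> v_to against the first axis."""
    return math.degrees(math.atan2(v_to[1] - v_from[1], v_to[0] - v_from[0]))


# Scaling both formants by the same factor leaves the angle unchanged,
# while scaling F1 and F2 by different factors changes it.
base = juxtaposition_angle((0.0, 0.0), (1.0, 1.0))
uniform = juxtaposition_angle((0.0, 0.0), (3.0, 3.0))
stretched = juxtaposition_angle((0.0, 0.0), (1.0, 2.0))
print(base, uniform, round(stretched, 2))
```

This is one way to see why a method that rescales F1 and F2 separately can alter juxtaposition angles relative to raw Hertz, whereas a single common scale factor cannot.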
Results - Test One
Proportional Reduction of Area Variance (improvement over Hertz)

            Nearey   W&F     mW&F    Lobanov
RP          0.071    0.350   0.389   0.923
Aberdeen    0.670    0.877   0.865   0.970

Results - Test One
Pitman-Morgan test of significance of dispersion differences; RP (N=20) and Aberdeen (N=6)
Pairwise comparisons among Hertz, Nearey, W&F, mW&F and Lobanov (p<0.05 *, p<0.001 **, ns not significant):
** ** * *
ns * ns *
ns * * ns
ns * ns ns
** ** ** **
-

Results - Test Two
Average vowel space overlaps, RP and Aberdeen data

            Hertz   Nearey   W&F    mW&F   Lobanov
RP          .380    .445     .452   .500   .564
Aberdeen    .444    .583     .598   .618   .658

Results - Test Two
Paired t-tests; RP (N=20) and Aberdeen (N=6)
Pairwise comparisons among Hertz, Nearey, W&F, mW&F and Lobanov (p<0.05 *, p<0.001 **, ns not significant):
** * * *
* ns ns **
* ns * *
** ** ** ns
** * ** *
-

Results - Test Three
Average perimeter angle values across normalization methods, RP data (20 RP speakers)

                    Hz    Nearey   W&F   W&F/Hz (Std Dev)   mW&F   mW&F/Hz (Std Dev)   Lobanov   Lobanov/Hz (Std Dev)
GOOSE-FLEECE-KIT    26    26       49    2.12 (0.52)        49     2.17 (0.52)         52        2.40 (0.75)
FLEECE-KIT-DRESS    195   195      195   1.00 (0.05)        195    1.00 (0.05)         193       1.00 (0.05)

Results - Test Three
[Figure: DRESS-LOT Juxtaposition: Nearey-normalised and Hertz. Y-axis: Angle in Degrees (-30 to 40); x-axis: Speakers 1-20, RP; series: Nearey without scaling, Hertz without scaling]

Results - Test Three
Angle juxtapositions across normalizations

Speakers               Hz      Nearey   W&F     W&F/Hz   Lobanov   Lobanov/Hz
TRAP-STRUT Older       2.44    2.43     5.38    2.93     3.63      3.68
TRAP-STRUT Younger     40.81   40.82    66.44   1.76     67.31     1.78
LOT-FOOT Older         32.11   32.09    13.64   0.39     10.85     0.32
LOT-FOOT Younger       80.94   80.92    65.60   0.81     64.54     0.79

Conclusions
• Test One - Area ratios
  - Lobanov > W&F, mW&F > Nearey
• Test Two - Co-extensiveness
  - Lobanov > mW&F > W&F, Nearey > Hertz
• Tests One and Two combined
  - Lobanov > mW&F > W&F > Nearey
• Test Three - Angles
  - Nearey > W&F > Lobanov

Conclusions
• Best-practice choices of normalization methods for sociophonetics are neither straightforward nor unidimensional!
• Method choice should be grounded in
  - a thorough knowledge of what each method achieves, and
  - consideration of the aim of the investigation, as well as the nature of the data

Forthcoming publication
BEST PRACTICES IN SOCIOPHONETICS
MARIANNA DI PAOLO (UNIVERSITY OF UTAH) AND MALCAH YAEGER-DROR (UNIVERSITY OF ARIZONA)
Routledge, 2009
Including a chapter on normalization...

Acknowledgements
• Dominic Watt and Daniel E. Johnson, both University of York
• Tyler Kendall, Duke University, NC
• Caroline Moreiras and Bronwen Evans, UCL, for access to Moreiras’ (2006) unpublished RP vowel formant data
• Jillian Oddie, UYork, for recording the Aberdeen data and carrying out analysis
• Victoria Watt for help with the Aberdeen data, and Bernhard Fabricius for help with mathematical and programming tasks

References 1

Adank, P. 2003. Vowel Normalization: A perceptual-acoustic study of Dutch Vowels. Ph.D. thesis, Katholieke
Universiteit Nijmegen.

Bigham, D. 2008. Dialect contact and accommodation among emerging adults in a university setting. Ph.D. thesis,
The University of Texas at Austin.

Cohen, A. 1990. Graphical Methods for Testing the Equality of Several Correlated Variances. The Statistician. 39, 1:
43-52.

Deterding, D. 1990. Speaker normalization for automatic speech recognition. Ph.D. thesis, University of Cambridge.

Deterding, D. 1997. The Formants of Monophthong vowels in Standard Southern British English Pronunciation.
JIPA. 27: 47-55.

Disner, S. 1980. Evaluation of vowel normalization procedures. JASA. 67: 253-61.

Fabricius, Anne. 2007a. Variation and change in the TRAP and STRUT vowels of RP: a real time comparison of five
acoustic data sets. Journal of the International Phonetic Association 37:3: 293-320.

Fabricius, A. 2007b. Vowel Formants and Angle Measurements in Diachronic Sociophonetic Studies: FOOT-fronting
in RP. Proceedings of the 16th ICPhS, Saarbrücken, August 2007. 1477-1480. www: www.icphs2007.de/.

Hawkins, S. & Midgley, J. 2005. Formant frequencies of RP monophthongs in four age groups of speakers. JIPA. 35:
183-199.

Kamata, Miho. 2008. An acoustic sociophonetic study of three London vowels. Ph.D. thesis, University of Leeds.

Koopmans-van Beinum, F. 1980. Vowel contrast reduction: an acoustical and perceptual study of Dutch vowels in
various speech conditions. Ph.D. thesis, University of Amsterdam.
References 2

Labov, William. 1994. Principles of Linguistic Change. Volume 1: Internal Factors. Oxford, UK/Cambridge,

Labov, William, Ash, Sharon, and Boberg, Charles. 2006. The Atlas of North American English: Phonology,
Phonetics, and Sound Change. A Multimedia Reference Tool. Berlin: Mouton de Gruyter.

Lobanov, B.M. 1971. Classification of Russian vowels spoken by different speakers. JASA 49(2B): 606-8.

Moreiras, C. 2006. An acoustic study of vowel change in female adult speakers of RP. Unpublished undergraduate
dissertation, University College London.

Nearey, T. 1977/8. Phonetic feature systems for vowels. Dissertation, University of Alberta. (published 1978 by
Indiana University Linguistics Club).

Thomas, E. 2002. Instrumental Phonetics. In J.K. Chambers, Peter Trudgill and Natalie Schilling-Estes (eds.). The
Handbook of Language Variation and Change. Oxford, UK/Malden, MA: Blackwell. 168-200.

Thomas, E. & Kendall, T. 2007. NORM: the Vowel Normalization and Plotting Suite. URL:
<http://ncslaap.lib.ncsu.edu/tools/norm/index.php>

Traunmüller, H. 1997. Auditory scales of frequency representation. Online at
http://www.ling.su.se/staff/hartmut/bark.htm

Watt, D. & Fabricius, A. 2002. Evaluation of a technique for improving the mapping of multiple speakers’ vowel
spaces in the F1 ~F2 plane. Leeds Working Papers in Linguistics and Phonetics 9: 159-73.

Watt, D. and Tillotson, J. 2001. A spectrographic analysis of vowel fronting in Bradford English. English World-Wide.
22(2):269-302.

Watt, D. and Yurkova, J. 2007. Voice Onset Time and the Scottish Vowel Length Rule in Aberdeen English.
Proceedings of ICPhS 16, Saarbrücken, Germany. 1521-1524. www: www.icphs2007.de/.

Wells, J.C. 1982. Accents of English (3 vols). Cambridge: Cambridge University Press.
Thank you for your attention
(The Watt and Fabricius Silver Medal)