Exploring the assumptions of language acquisition models
Christos Christodoulopoulos, Cynthia Fisher and Dan Roth
Midwest Speech & Language Days 2015
Models of language acquisition
“The girl chases the boy”
“The boy runs”
Semantic Role Labeling
PropBank corpus
[Palmer et al. 2005]
Core arguments:
A0 - Agent
A1 - Patient
A2 - Recipient
…
Modifiers:
Locative
Temporal
Manner
…
Example: “[The girl]A0 [chases]pred [the boy]A1”
BabySRL [Connor et al. 2008; 2010]
BabySRL corpus
Adam, Eve, Sarah [Brown, 1973]
Adult utterances (cleaned up)
Focus on verb predicates
1 verb 2 args (24% of sent.)
Example: “[The girl]A0 [chases]pred [the boy]A1”
Experiment 1: Supervised learning
Given perfect feedback, do simple, bottom-level
features capture anything useful about semantic
roles/verb preferences?
Experiment 1: Supervised learning
• Supervised classifier (averaged perceptron)
  • LBJava [Rizzolo and Roth, 2010]
• Train on BabySRL corpus
• Test on novel verb sentences
  Intransitive: “The bunny krads”
  Transitive: “The boy krads the girl”
  Ditransitive: “The girl krads the boy a bunny”
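The classifier named on this slide can be illustrated with a generic multiclass averaged perceptron. This is a minimal sketch only: it is not LBJava's implementation, and the class name, feature strings, and toy training data are all assumptions made for illustration.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Generic multiclass averaged perceptron over sparse string features.

    Illustrative only; not the LBJava learner used in the experiments.
    """

    def __init__(self, labels):
        self.labels = list(labels)
        self.w = defaultdict(float)      # current weights, keyed by (feature, label)
        self.total = defaultdict(float)  # sum of the weight vector after every example
        self.steps = 0

    def score(self, feats, label, weights):
        return sum(weights.get((f, label), 0.0) for f in feats)

    def predict(self, feats, weights=None):
        weights = self.w if weights is None else weights
        # Ties are broken in favour of the label listed first.
        return max(self.labels, key=lambda y: self.score(feats, y, weights))

    def fit(self, data, epochs=5):
        for _ in range(epochs):
            for feats, gold in data:
                guess = self.predict(feats)
                if guess != gold:
                    for f in feats:
                        self.w[(f, gold)] += 1.0
                        self.w[(f, guess)] -= 1.0
                # Naive averaging: accumulate the full weight vector each step.
                for key, value in self.w.items():
                    self.total[key] += value
                self.steps += 1
        return {key: value / self.steps for key, value in self.total.items()}


# Hypothetical toy data: one feature per noun, labelled with its role.
train = [({"npat=1st of 2"}, "A0"), ({"npat=2nd of 2"}, "A1")]
model = AveragedPerceptron(["A0", "A1"])
averaged = model.fit(train)
```

Predicting with the averaged weights rather than the final ones is what distinguishes this learner from the plain perceptron; the accumulation here is the simplest correct form, not an efficient one.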
Experiment 1: Features
Example: “The girl chases the boy” (girl = A0, boy = A1)
• Most frequent label
• Lexical features: chase-girl, chase-boy
• Noun Pattern: 1st of 2 (girl), 2nd of 2 (boy)
• Verb Position: Before (girl), After (boy)
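The feature templates on this slide can be sketched as a small extraction function. This is an illustrative reading of the slide, not the BabySRL code: the function names, the dictionary keys, and the choice to pass the verb lemma and noun positions in by hand are all assumptions.

```python
def ordinal(n):
    """Return '1st', '2nd', '3rd', '4th', ... for a positive integer."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def extract_features(tokens, noun_positions, verb_index, verb_lemma):
    """Build one feature dict per noun (hypothetical layout).

    tokens         -- list of words in the sentence
    noun_positions -- sorted token indices of the candidate argument nouns
    verb_index     -- token index of the predicate
    verb_lemma     -- lemma of the predicate, supplied by the caller
    """
    features = []
    for rank, i in enumerate(noun_positions, start=1):
        features.append({
            # Lexical feature: verb paired with the noun, e.g. "chase-girl".
            "lex": f"{verb_lemma}-{tokens[i]}",
            # Noun Pattern: position of this noun among all nouns.
            "npat": f"{ordinal(rank)} of {len(noun_positions)}",
            # Verb Position: is the noun before or after the verb?
            "vpos": "Before" if i < verb_index else "After",
        })
    return features


tokens = ["The", "girl", "chases", "the", "boy"]
features = extract_features(tokens, noun_positions=[1, 4], verb_index=2,
                            verb_lemma="chase")
```

Note that Noun Pattern needs no knowledge of where the verb is, whereas Verb Position does.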
Experiment 1: Results
[Figure: bar chart; y-axis 0 to 100; x-axis (feature sets): Most Freq., Lex, +NPat, +VPos, +NPat & VPos; legend: A0A1, A0A2, A0A0, A1A1; annotation: “Predicate knowledge”]
Multiple predicates
“Remember how we play the surprise game?” (labels: A1)
“Remember how we play the surprise game?” (labels: A0, A1)

          # sent    %
1 verb    10,356    69.86
2 verbs    3,614    24.38
Effect of multiple predicates
(Noun Pattern)
[Figure: bar chart; y-axis 0 to 100; x-axis: all, first, last; legend: A0, A1, A2]
Syntactic surface the same, e.g. NPat: 4th out of 5
Experiment 1: Supervised learning
Given perfect feedback, do simple, bottom-level
features capture anything useful about semantic
roles/verb preferences?
Yes, but predicate knowledge is crucial
Experiment 2: Unsupervised learning
Can we predict arguments/predicates using
distributional clusters and a few seed nouns?
Syntactic Bootstrapping via Structure-Mapping
[Gleitman, 1990; Fisher et al. 2010]
Experiment 2: Unsupervised learning
• HMM over 2.2M tokens (CHILDES)
• 80 induced clusters, list of function words
• List of seed nouns [Dale and Fenson, 1996]
• Noun identification
  “Cluster contains more than k seed nouns”
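The noun identification rule quoted on this slide can be sketched as follows. It is a minimal illustration under stated assumptions: the word-to-cluster mapping, the seed list, and the function names are hypothetical, and the cluster ids are borrowed from the example sentence only for readability.

```python
from collections import Counter


def noun_clusters(word_to_cluster, seed_nouns, k):
    """Return the clusters that contain more than k distinct seed nouns."""
    counts = Counter(
        word_to_cluster[w] for w in set(seed_nouns) if w in word_to_cluster
    )
    return {cluster for cluster, n in counts.items() if n > k}


def tag_nouns(tokens, word_to_cluster, clusters):
    """Mark each token True if its cluster was identified as a noun cluster."""
    return [word_to_cluster.get(t) in clusters for t in tokens]


# Hypothetical toy mapping from words to HMM cluster ids.
word_to_cluster = {
    "she": 45, "mommy": 45, "daddy": 45,
    "truck": 73, "ball": 73, "dog": 73,
    "krads": 51, "a": 19, "red": 60,
}
seeds = ["truck", "ball", "dog", "mommy", "daddy"]

clusters = noun_clusters(word_to_cluster, seeds, k=1)
tags = tag_nouns(["she", "krads", "a", "red", "truck"], word_to_cluster, clusters)
```

A word never seen as a seed noun (here “she”) is still tagged as a noun because it shares a cluster with enough seeds, which is the point of combining the clusters with a short seed list.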
Experiment 2: Verb Identification
Word     HMM state   N Ident.   Funct.
She      45          N
krads    51
a        19                     F
red      60
truck    73          N

[Figure: bar chart for HMM states 51 and 60; y-axis 0 to 60; legend: 0 args, 1 arg, 2 args, 3 args]
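One plausible reading of this slide is that each remaining content-word state keeps a histogram of how many identified nouns co-occur with it, and the verb is the candidate whose state best matches the number of nouns in the current sentence. The sketch below follows that reading; it is an assumption rather than the authors' procedure, and the function names and toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def arg_count_stats(corpus, noun_states, function_states):
    """For each non-noun, non-function state, count sentences by number of nouns."""
    stats = defaultdict(Counter)
    for states in corpus:
        n_args = sum(s in noun_states for s in states)
        for s in states:
            if s not in noun_states and s not in function_states:
                stats[s][n_args] += 1
    return stats


def identify_verb(states, stats, noun_states, function_states):
    """Return the index of the most verb-like token, or None if there is none."""
    n_args = sum(s in noun_states for s in states)
    best_index, best_p = None, 0.0
    for i, s in enumerate(states):
        if s in noun_states or s in function_states:
            continue  # identified nouns and function words are not candidates
        histogram = stats.get(s, Counter())
        total = sum(histogram.values())
        p = histogram[n_args] / total if total else 0.0
        if p > best_p:
            best_index, best_p = i, p
    return best_index


noun_states, function_states = {45, 73}, {19}
# Hypothetical toy corpus of HMM state sequences.
corpus = [
    [45, 51, 19, 73],
    [45, 51, 19, 60, 73],
    [19, 60, 73],
    [19, 60, 60, 73],
]
stats = arg_count_stats(corpus, noun_states, function_states)

# "She krads a red truck" as the state sequence shown on the slide.
verb_index = identify_verb([45, 51, 19, 60, 73], stats, noun_states, function_states)
```

In this toy corpus state 51 always appears with two nouns while state 60 mostly appears with one, so the token in state 51 (“krads”) is selected over the one in state 60 (“red”).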
Experiment 2: Results
[Figure: plot; y-axis 0 to 1; x-axis ticks 1, 25, 49, 73; series: arg-F, verb-F, verbRand-F]
Experiment 2: Parameters
• Random/frequent seed noun selection
• Variants + plurals of seed nouns
• Verb/predicate evaluation
• Multiple predicates
• Seed noun threshold k
• Null predictions
• Function words
Experiment 2: Parameters
[Figure: plot; y-axis 0 to 1; x-axis ticks 1, 25, 49, 73; series: verb, verb FREQ, verbRand, verbRand FREQ]
Experiment 2: Parameters
[Figure: plot @24 seed nouns; y-axis 0.5 to 1; x-axis: Freq, Freq + Var; series: verb, verbRand]
Experiment 2: Parameters
[Figure: plot @24 seed nouns; y-axis 0.5 to 1; x-axis: Relaxed, Strict (all), Strict (first); series: verb, verbRand]
Experiment 2: Unsupervised learning
Can we predict arguments/predicates using
distributional clusters and a few seed nouns?
Yes, with as few as 24 seed nouns
Need to consider multiple predicates
Conclusions
• BabySRL model of language acquisition
• Evidence for syntactic bootstrapping
• Exploration of assumptions
  • Data representation
  • Evaluation
  • Psycholinguistic validity
Future Directions
• BabySRL from scratch [Connor et al. 2012]
• Beyond single predicates
  • Multiple verbs
  • Prepositions
• Relaxing perfect feedback (scene ambiguity)
  • Superset
  • Bootstrapped Animacy
Thanks