 
        Towards Events Annotated Corpus of Polish
Michał Marcińczuk, Marcin Oleksy, Jan Kocoń,
Tomasz Bernaś and Michał Wolski
{michal.marcinczuk, marcin.oleksy, jan.kocon,
tomasz.bernas, michal.wolski}@pwr.edu.pl
Institute of Informatics
Wrocław University of Technology
Wybrzeże Wyspiańskiego 27,
Wrocław, Poland
April 10, 2015
Work financed as part of the investment in the CLARIN-PL research infrastructure
funded by the Polish Ministry of Science and Higher Education.
Introduction » Event recognition
Event recognition
Part of Natural Language Engineering
Major task in the field of Information Extraction (IE)
The goal is to identify actions and some states described in
text:
Textual evidence of an event
Event arguments (who? when? where? ...)
Event attributes (specific/generic, true/false, past/present ...)
Structural representation of events ⇒ data mining
Introduction » Example
Example (1/2)
Text
Two Russians and a Frenchman left the Mir and endured a
rough landing on the snow-covered plains of Central Asia on
Thursday. (...) The two Russians arrived on the Mir last
August (...). Solovyou celebrated his 50th birthday during
his six-month space voyage.
source: http://www.themoscowtimes.com/
What to annotate:
Temporal expressions
Events
Signals
Links
Introduction » Example
Example (2/2)
Rob O’Neil, the Navy SEALs veteran who shot (zastrzelił) Osama
bin Laden in May 2011 (w maju 2011 roku), left (odszedł) the
unit after 16 years of service (po 16 latach służby) and
revealed (ujawnił) his identity. The man's name came to light
(wyszło na jaw) after the American news channel Fox News
reported (poinformowała) that the soldier would give (udzieli)
it an interview and talk (opowie) about the whole operation
(akcji) aimed at the head of Al-Qaeda. As his father says
(mówi), they are not afraid (nie boją się) of revenge (zemsty)
from the Islamic State or other terrorist organizations.
Introduction » Example
Example (2/2) - timeline
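[Timeline figure; labels recovered from the slide]
Temporal expressions: maju 2011 roku (May 2011), 16 lat (16 years)
Events: zastrzelił (shot), odszedł (left), ujawnił (revealed),
poinformowała (reported), wyszło (na jaw) (came to light),
mówi (says), udzieli (will give), opowie (will tell),
(nie) boimy się ((not) afraid), akcji (operation),
zemsty (revenge), służby (service)
Participants: Rob O’Neil, Fox News, ojciec (father)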
Events in TimeML » What to annotate?
What to annotate?
Tensed Verbs: A fresh flow of lava, gas and debris erupted there
Saturday.
Untensed verbs: Prime Minister Benjamin Netanyahu called the
prime minister of the Netherlands to thank him for
thousands of gas masks (...).
Nominalizations: Israel will ask the US to delay a military strike
against Iraq until the Jewish state is fully prepared for
a possible Iraqi attack.
Adjectives: A Philippine volcano, dormant for six centuries,
began exploding with searing gases, thick ash and
deadly debris.
Prepositional phrases: All 75 people on board the Aeroflot Airbus
died.
Predicative Clauses: "There is no reason why we would not be
prepared," Mordechai told the Yediot Ahronot daily.
Events in TimeML » Classes
Classes
REPORTING: say, report, announce, ...
PERCEPTION: see, hear, watch, feel, ...
ASPECTUAL: begin, start, finish, stop, continue, ...
I_ACTION: attempt, try, promise, offer, regret, ...
I_STATE: believe, want, wish, ...
STATE: be on board, kidnapped, recovering, love, ...
OCCURRENCE: die, crash, build, merge, sell, take advantage of, ...
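The class inventory above can be written down as a small lookup table. The sketch below is only an illustration: the example triggers are copied from the list, while the names `EventClass`, `EXAMPLE_TRIGGERS` and `candidate_classes` are hypothetical. Real class assignment depends on context, so a lexicon lookup like this one can at most propose candidates.

```python
from enum import Enum
from typing import List


class EventClass(Enum):
    REPORTING = "REPORTING"
    PERCEPTION = "PERCEPTION"
    ASPECTUAL = "ASPECTUAL"
    I_ACTION = "I_ACTION"
    I_STATE = "I_STATE"
    STATE = "STATE"
    OCCURRENCE = "OCCURRENCE"


# Example triggers copied from the class list; not an exhaustive lexicon.
EXAMPLE_TRIGGERS = {
    EventClass.REPORTING: ["say", "report", "announce"],
    EventClass.PERCEPTION: ["see", "hear", "watch", "feel"],
    EventClass.ASPECTUAL: ["begin", "start", "finish", "stop", "continue"],
    EventClass.I_ACTION: ["attempt", "try", "promise", "offer", "regret"],
    EventClass.I_STATE: ["believe", "want", "wish"],
    EventClass.STATE: ["be on board", "kidnapped", "recovering", "love"],
    EventClass.OCCURRENCE: ["die", "crash", "build", "merge", "sell",
                            "take advantage of"],
}


def candidate_classes(trigger: str) -> List[EventClass]:
    """Return every class whose example list contains the trigger."""
    normalized = trigger.lower()
    return [cls for cls, examples in EXAMPLE_TRIGGERS.items()
            if normalized in examples]


print(candidate_classes("begin"))  # [<EventClass.ASPECTUAL: 'ASPECTUAL'>]
```

A trigger that is not in the example lists simply yields an empty list, which is the expected behaviour for such a small inventory.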
Event annotation » Textual mentions
Textual mentions (1/2)
Step 1: Annotation of textual mentions of events
[Diagram relating an event X to an event Y. Labels recovered from
the slide: Action (dynamic), State (static), Reporting, Verb,
Auxiliary verbs, Perception, Aspectual, I_Action, I_State,
Light_predicate; regions marked A, B, C and D; notes "We know
that X occurred or not" and "We do not know if X occurred or not".]
Event annotation » Textual mentions
Textual mentions (2/2)
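Category         Form (English gloss)              Count
Action           pracy (work)                         53
                 spotkania (meetings)                 43
                 spotkanie (meeting)                  40
                 zginęło (died)                       35
                 odbędzie (will take place)           35
I_Action         zapowiedział (announced)             13
                 pozwala (allows)                     10
                 zgodę (consent)                       9
                 wymaga (requires)                     9
                 proszę (I ask)                        8
State            ma (has)                             93
                 mają (have)                          47
                 mieć (to have)                       23
                 miał (had)                           23
                 oznacza (means)                      21
I_State          można (one can)                     163
                 może (can, may)                     120
                 ma (is to)                           45
                 trzeba (one must)                    42
                 należy (one should)                  39
Reporting        powiedział (said)                    41
                 mówi (says)                          26
                 stwierdził (stated)                  13
                 mówił (was saying)                   13
                 informuje (informs)                   9
Light_predicate  doszło (came about)                   5
                 ma (has)                              5
                 ulec (to undergo)                     5
                 dokonał (carried out)                 4
                 prowadzić (to conduct)                3
Perception       widać (can be seen)                  18
                 zobacz (see, imperative)             18
                 zobaczyć (to see)                    12
                 widzę (I see)                         7
                 posłuchać (to listen)                 5
Aspectual        zakończył (finished)                 13
                 zaczęła (began)                      12
                 zaczyna (begins)                     10
                 rozpoczął (started)                   9
                 końcowa (final)                       8
Table 1: Top 5 mentions (orthographic forms) for each category.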
What has been done? » Event mentions
Event mentions in KPWr
Documents in KPWr: 1634
Documents annotated: 558
Annotations (unique): 9557
Annotations (total): 24023
[Bar chart; legend: Action, Aspectual, I_Action, I_State,
Light_predicate, Perception, Reporting, State]
What has been done? » Annotation agreement
Annotation agreement
Positive specific agreement between two annotators (A and B) for
100 documents from KPWr.
Category              Both annotators   One only   Other only   Agreement
Events (only spans)              3184        393          664      85.76%
Events                           2561       1016         1287      68.98%
Action                           2085        766          418      77.89%
Aspectual                          46          4           10      86.79%
Perception                         20          2           37      50.63%
Reporting                          39         29           28      57.78%
I_Action                           23         19           21      53.49%
I_State                           115         61           70      63.71%
State                             213         92          634      36.98%
Light_predicate                    20         41           20      39.60%
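The percentages above are consistent with the usual definition of positive specific agreement, 2·both / (2·both + only-first + only-second). The sketch below assumes that the three count columns are, in order, annotations made by both annotators, by one annotator only, and by the other annotator only; the slide does not label them, and the function name is hypothetical. Since the formula is symmetric, it does not matter which of the last two columns belongs to annotator A.

```python
def positive_specific_agreement(both: int, only_first: int,
                                only_second: int) -> float:
    """PSA = 2*both / (2*both + only_first + only_second)."""
    denominator = 2 * both + only_first + only_second
    if denominator == 0:
        raise ValueError("no annotations from either annotator")
    return 2 * both / denominator


# Three rows taken from the slide: (both, one only, other only).
rows = {
    "Events (only spans)": (3184, 393, 664),
    "Aspectual": (46, 4, 10),
    "State": (213, 92, 634),
}

for name, counts in rows.items():
    print(f"{name}: {100 * positive_specific_agreement(*counts):.2f}%")
# Events (only spans): 85.76%
# Aspectual: 86.79%
# State: 36.98%
```

The low score for State comes mainly from the 634 mentions marked by only one of the annotators, which outweigh the 213 shared ones.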
What is to be done? » Event arguments
Arguments
Step 2: Linking event mentions with generic arguments
agency: who performed the action,
temporal: when the action was performed and how long the action
lasted or how long the state held,
spatial: where the action was performed.
What is to be done? » Event attributes
Attributes
Step 3: Describing event attributes
generality: specific or general,
tense: past, present or future,
polarity: affirmative or negative.
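The arguments from Step 2 and the attributes from Step 3 suggest a simple record per event mention. The following is a minimal sketch under stated assumptions, not the KPWr annotation format: all class and field names are hypothetical, and the example annotation of "zastrzelił" is invented for illustration from the example text earlier in the slides.

```python
from dataclasses import dataclass, field
from typing import Optional

# Allowed values copied from the attribute list; names are illustrative.
ALLOWED = {
    "generality": {"specific", "general"},
    "tense": {"past", "present", "future"},
    "polarity": {"affirmative", "negative"},
}


@dataclass
class EventArguments:
    agency: Optional[str] = None    # who performed the action
    temporal: Optional[str] = None  # when, or for how long
    spatial: Optional[str] = None   # where


@dataclass
class EventAttributes:
    generality: str
    tense: str
    polarity: str

    def __post_init__(self) -> None:
        # Reject values outside the three closed sets.
        for name, allowed in ALLOWED.items():
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"{name}={value!r} not in {sorted(allowed)}")


@dataclass
class EventMention:
    text: str       # textual mention of the event
    category: str   # e.g. "Action", "State", "Reporting"
    arguments: EventArguments = field(default_factory=EventArguments)
    attributes: Optional[EventAttributes] = None


# Hypothetical annotation, for illustration only.
example = EventMention(
    text="zastrzelił",
    category="Action",
    arguments=EventArguments(agency="Rob O'Neil", temporal="w maju 2011 roku"),
    attributes=EventAttributes("specific", "past", "affirmative"),
)
print(example.arguments.agency, example.attributes.tense)
```

Keeping arguments optional reflects that a mention may lack, say, a spatial argument, while the attributes are validated against the closed value sets listed above.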
What is to be done? » Event linking
Event linking
Step 4: Event linking
subordination link: a relation between events (modal, factive,
counter-factive, evidential, negative evidential, conditional),
aspectual link: a relation between an aspectual event and its
argument event.
The End
Thank you for your attention.