Seminar Topics Distribution & How to Find Literature Wiltrud Kessler Summer 2012

Seminar Topics Distribution
& How to Find Literature
Wiltrud Kessler
Institut für Maschinelle Sprachverarbeitung
Universität Stuttgart
Summer 2012
Organizational Stuff
Topics
Finding and Citing Literature
Bibliography
Outline
Organizational Stuff
Topics
Finding and Citing Literature
Bibliography
The Plan for Today
1. All topics will be presented (in a very short way).
2. Distribution of topics
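($\forall x\, \exists y : \mathrm{student}(x) \wedge \mathrm{topic}(y) \rightarrow \mathrm{hasTopic}(x, y) \wedge (\nexists z : \mathrm{hasTopic}(a, z) \wedge \mathrm{hasTopic}(b, z) \wedge a \neq b)$),
i.e. every student gets a topic and no topic goes to two different students.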
3. Distribution of presentation dates:
   - No Schein (no course credit): dates in May / June preferred.
   - Schein (course credit): dates in June / July preferred.
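
The condition on the topic distribution (item 2) can be checked
mechanically. The following is a minimal sketch, assuming the
distribution is kept as a mapping from student to topic; the function
name and the student names are hypothetical and only serve as
illustration.

```python
from collections import Counter


def check_distribution(assignment, students, topics):
    """Return a list of problems with a student -> topic assignment.

    An empty list means both conditions hold: every student has a
    known topic, and no topic is shared by two students.
    """
    problems = []
    for student in students:
        topic = assignment.get(student)
        if topic is None:
            problems.append(f"{student} has no topic")
        elif topic not in topics:
            problems.append(f"{student} has unknown topic {topic!r}")
    # A topic may be taken by at most one student.
    counts = Counter(t for s, t in assignment.items() if s in students)
    for topic, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"{topic!r} is assigned to {n} students")
    return problems


topics = {f"Topic {i}" for i in range(1, 9)}        # the eight seminar topics
students = ["student_a", "student_b", "student_c"]  # hypothetical names

ok = {"student_a": "Topic 1", "student_b": "Topic 3", "student_c": "Topic 8"}
clash = {"student_a": "Topic 1", "student_b": "Topic 1"}

print(check_distribution(ok, students, topics))
print(check_distribution(clash, students, topics))
```

The first call reports no problems; the second reports a student without
a topic and a topic that was handed out twice.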
Course Evaluation
- To get credit for this class, you need to choose a topic and
  present it in class.
- The oral presentation should be around 30–45 minutes with a
  discussion afterwards (25 % of grade).
- The written report should contain about X pages, where X depends
  on your program; a template will be provided (50 % of grade).
- You need to participate actively in class; this includes reading
  all other papers (25 % of grade).

For changes or further questions, please look at the evaluation page
linked from the homepage.
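
How the three percentages combine can be illustrated with a small
calculation. This is only a sketch: it assumes that each part receives a
numeric grade and that the overall grade is their weighted mean. The
grading scale, the example grades and any rounding are assumptions, so
the evaluation page remains the authoritative source.

```python
# Weights from the list above: presentation 25 %, report 50 %,
# participation 25 %.
WEIGHTS = {"presentation": 0.25, "report": 0.50, "participation": 0.25}


def overall_grade(part_grades):
    """Weighted mean of the three part grades (no rounding applied)."""
    if set(part_grades) != set(WEIGHTS):
        raise ValueError("expected exactly: " + ", ".join(WEIGHTS))
    return sum(WEIGHTS[part] * grade for part, grade in part_grades.items())


# Hypothetical part grades on a 1.0 (best) to 5.0 scale.
example = {"presentation": 2.0, "report": 1.0, "participation": 3.0}
print(overall_grade(example))
```

With these example grades, the report counts as much as the other two
parts together.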
Oral Presentation
Length: 30–45 minutes + discussion (shorter if no Schein).
Template: No template. Slides are not mandatory; feel free to use
    the blackboard or handouts. For slides, LaTeX Beamer is
    recommended (https://bitbucket.org/rivanvx/beamer/wiki/Home;
    these slides were created with LaTeX Beamer).
Submission: The deadline is the date of the presentation; it is
    recommended to get feedback beforehand.
    Submissions will be managed in ILIAS.
Format: PDF.

All seminar participants will read the main paper before the talk.
Questions will be clarified in class.
Written Report (1)
Length: See “Written Report (2)”.
Template: Two-column conference paper format; a template will be
    available from the homepage for Word and LaTeX.
Submission: A preliminary report has to be handed in a week before
    the talk. The final report has to be handed in a week after the
    talk. Submissions will be managed in ILIAS.
Format: Original document (DOC / TEX + figures) and PDF.
Written Report (2)
B.Sc.: 10–12 pages, focus on main paper.
    The final grade for the module will be the average of the grades
    in “Sentiment Analysis” and “Natural Language Generation”.
Inf/M.Sc. elective, 3 ECTS: 10–12 pages, focus on main paper.
M.Sc. elective, 6 ECTS: 15–20 pages; discuss related work, search
    for more literature, comment on the work in detail.
M.Sc. concentration: Please forget what I said last week!
    5–10 pages, only main paper → Vorleistung (ungraded
    prerequisite). The questions in the oral exam will determine the
    grade for the concentration.
No Schein: No report.
Topic 1: Subjectivity Classification
Subjective statements refer to the internal state of mind of a
person and cannot be observed. In contrast, objective statements
can be verified by observing and checking reality. It is sometimes
useful for a sentiment analysis system to filter out objective
language and predict sentiment based on subjective language only.
Unfortunately, detecting subjectivity is also a complicated problem.
References: [RW03], [WWH05]
Topic 2: Subjectivity Word Sense Disambiguation
Sentiment analysis often uses dictionaries that list the polarity of
each word. However, many words have both subjective and
objective senses. Subjective words used in an objective sense are a
significant source of error in sentiment classification. Subjectivity
word sense disambiguation tries to automatically determine which
word instances in a corpus are being used with objective senses.
References: [WM06], [AWM09], [AWCM11]
Topic 3: Polarity Reversers
A lexicon of positive and negative words alone is often not sufficient
to determine the polarity of an expression, because many phenomena
can influence the polarity. The most obvious examples of such
influences are “polarity reversers”, words that reverse the polarity
of a sentiment word, e.g., “no” or “not”. One approach to tackle this
problem is to assume the polarity of a word is known and classify
each sentiment word as reversed or non-reversed according to its
context.
References: [ITRO08], [CC08], [WBRK10]
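
To make the setup concrete, here is a deliberately naive sketch of the
"known polarity, then reversed or non-reversed" idea. It swaps in a fixed
window rule for the context decision, so it is not the method of any of
the referenced papers, which learn this decision from data. The lexicon,
the reverser list and the window size are illustrative assumptions.

```python
# Toy lexicon and reverser list; both are illustrative assumptions,
# not resources from the referenced papers.
LEXICON = {"good": 1, "great": 1, "bad": -1, "boring": -1}
REVERSERS = {"no", "not", "never"}


def word_polarities(tokens, window=2):
    """Return (word, polarity) pairs for all sentiment words in tokens.

    The lexicon polarity of a word is flipped when a reverser occurs
    among the `window` tokens directly before it.
    """
    result = []
    for i, token in enumerate(tokens):
        prior = LEXICON.get(token)
        if prior is None:
            continue  # not a sentiment word
        context = tokens[max(0, i - window):i]
        is_reversed = any(t in REVERSERS for t in context)
        result.append((token, -prior if is_reversed else prior))
    return result


print(word_polarities("the plot is not very good".split()))
print(word_polarities("a great cast but a boring story".split()))
```

In the first sentence "good" comes out negative because "not" is within
the window; in the second sentence both words keep their lexicon
polarity. A rule like this breaks quickly on longer or more complex
contexts, which is the motivation for the learned approaches.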
Topic 4: Conditional Sentences
Conditional sentences are sentences that describe implications or
hypothetical situations and their consequences. Some conditional
sentences directly express sentiment on a product, but many of
them express a hypothetical situation, a wish or a general
implication.
References: [NLC09]
Topic 5: Comparative Sentences
A common way to express opinions is by comparing one entity with
a different entity. There are different types of comparisons: direct
comparisons of two entities, comparisons of an entity to a general
standard, and superlatives that set one entity above all others in
the comparison set. Simply detecting comparative adverbs or
adjectives is not sufficient: a sentence can contain a comparative
word without being a comparative sentence (“couldn’t agree with you
more”), while a comparative sentence does not necessarily have to
include any comparative word (“no joy stick unlike the sony ericsson
t60”).
References: [JL06a], [JL06b], [GL08]
Topic 6: Topic Models
These papers present a framework for extracting ratable aspects of
objects from online user reviews. A statistical model is used to
discover topics in text and extract text snippets supporting the
ratings of different aspects.
References: [TM08b], [TM08a]
Topic 7: Linguistic Features
Many classifiers for the classification of sentiment polarity use only
shallow features like bag-of-words. To enhance the accuracy of
sentiment polarity classification, several features based on linguistic
analysis and syntactic structures have been proposed.
References: [DLP03], [Gam04], [MTO05]
Topic 8: Opinion Spam
The term “opinion spam” refers to fictitious reviews that have been
written to mislead humans or automatic systems in their evaluation
of the opinions about a product or a service. Fictitious positive
reviews are written to artificially improve the perceived opinion of a
product or a service; fictitious negative reviews are written to damage
the reputation of a competitor or its products.
References: [JL07], [JL08], [OCCH11]
Topic Distribution

Student | Topic                   | Date | Program
        | Topic 1 (Subjectivity)  |      |
        | Topic 2 (Word Sense)    |      |
        | Topic 3 (Reversers)     |      |
        | Topic 4 (Conditionals)  |      |
        | Topic 5 (Comparatives)  |      |
        | Topic 6 (Topic Models)  |      |
        | Topic 7 (Features)      |      |
        | Topic 8 (Spam)          |      |
How do I Find Literature?
- Search for keywords.
- Follow references in papers that you already have.
- Look for papers that cite a paper that you already have
  (citation indexes).
- Follow the references (!!) listed in Wikipedia articles.
- Look for more papers from the same author (the last author in
  the list is usually the professor).
- It is not useful to only collect literature; you also have to read
  it. If there are too many papers to read all of them, it may be
  helpful to restrict the search to good conferences (ACL, COLING,
  EMNLP, NAACL, EACL) or journals (Computational Linguistics).
- Wikipedia and other web pages are not trustworthy sources.
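
Following references backwards and citations forwards amounts to a
breadth-first walk over the citation graph. The sketch below shows the
idea on made-up data: the paper names are placeholders, and `cited_by`
stands in for a lookup that would normally be done by hand in a citation
index. The depth limit reflects the advice above to collect only as much
as you can actually read.

```python
from collections import deque

# Hypothetical citation data: paper -> papers it cites.
# The keys are placeholders, not real publications.
REFERENCES = {
    "seed": ["old_a", "old_b"],
    "new_x": ["seed", "old_a"],
    "new_y": ["new_x"],
    "old_a": [],
    "old_b": ["old_c"],
    "old_c": [],
}


def cited_by(paper):
    """Papers that cite `paper` (what a citation index would return)."""
    return [p for p, refs in REFERENCES.items() if paper in refs]


def snowball(seed, max_depth=1):
    """Collect papers reachable from `seed` via references or citations."""
    found = {seed: 0}  # paper -> distance from the seed
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        depth = found[paper]
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in REFERENCES.get(paper, []) + cited_by(paper):
            if neighbour not in found:
                found[neighbour] = depth + 1
                queue.append(neighbour)
    return sorted(p for p in found if p != seed)


print(snowball("seed"))
print(snowball("seed", max_depth=2))
```

With depth 1 the walk returns the papers the seed cites plus the one
paper citing it; with depth 2 it also reaches their neighbours.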
How do I get the Literature?
- Offline (books etc.)
  - University library, state library (Landesbibliothek),
    interlibrary loan (Fernleihe).
  - Verzeichnis lieferbarer Bücher (VLB), the German books-in-print
    catalogue.
- Online
  - Scientific search engines: Google Scholar, CiteSeer, DBLP, ...
  - Citation indexes: Google Scholar, CiteSeer, Scirus, (ACM),
    (IEEE), ...
  - Digital publication lists (IEEE, ACM, Springer, ...): often the
    publications there are only available to subscribers; the
    university library has some of these subscriptions (e.g. IEEE,
    ACM).
  - Elektronische Zeitschriftendatenbank (EZB).
  - Web pages of conferences or authors.
How do I use Literature? – Citations/References
- Whenever something is not your own idea, you need to specify
  where you got that idea from (≠ Guttenberg).
- This is done with a citation.
- A citation is a reference (a shorthand name) to some source.
- There are different types of citations:
  - Verbatim citation (rare in NLP literature):
    “Sentiment analysis is the task of identifying positive and
    negative opinions, emotions, and evaluations.” [WWH05]
  - Author(s) of the paper as part of the sentence:
    [MTO05] use dependency sub-tree patterns for sentiment
    classification.
  - Author(s) of the paper at the end of the sentence:
    Content words as well as function words have been used as
    polarity reversers [CC08].
How do I use Literature? – Bibliography
- All the sources cited in your text must be listed at the end of
  your report in the bibliography.
- The bibliography contains the citation shorthands used for
  reference in the report and the corresponding “long entry”,
  usually sorted alphabetically by first author.
- The bibliography entry must contain all information needed to
  enable your reader to find your source again.
- Main types of sources for us are conference papers, journal
  papers and books (again, no web pages or Wikipedia).
- Necessary information: author(s), title, year, conference
  proceedings OR journal name, pages (papers), publisher (books).
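
The list of necessary information maps directly onto the fields of a
BibTeX entry. The sketch below checks an entry for completeness and
renders it in BibTeX syntax. The mapping of source types to the BibTeX
entry types `inproceedings`, `article` and `book` is one reasonable
reading of the list above, not a rule from this seminar, and the helper
names as well as the entry key `CC08` are chosen for illustration only.

```python
# Required fields per source type, following the list above.
REQUIRED = {
    "inproceedings": ["author", "title", "booktitle", "pages", "year"],
    "article": ["author", "title", "journal", "pages", "year"],
    "book": ["author", "title", "publisher", "year"],
}


def missing_fields(entry_type, fields):
    """Names of required fields that are absent or empty."""
    return [name for name in REQUIRED[entry_type] if not fields.get(name)]


def to_bibtex(key, entry_type, fields):
    """Render one entry in BibTeX syntax; refuse incomplete entries."""
    missing = missing_fields(entry_type, fields)
    if missing:
        raise ValueError(f"{key}: missing {', '.join(missing)}")
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@{entry_type}{{{key},\n{body}\n}}"


# Data taken from the [CC08] entry in the bibliography of these slides.
cc08 = {
    "author": "Yejin Choi and Claire Cardie",
    "title": "Learning with compositional semantics as structural "
             "inference for subsentential sentiment analysis",
    "booktitle": "Proceedings of EMNLP '08",
    "pages": "793--801",
    "year": "2008",
}
print(to_bibtex("CC08", "inproceedings", cc08))
```

Checking for missing fields before writing the entry catches the most
common problem early: a reference that the reader cannot find again.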
How do I use Literature? – Citation Styles
- There are different citation styles.
- Computer science people often cite using numbers (e.g. [123])
  or using authors’ first letters and years (e.g. [CC08]).
- NLP people often use complete names and years, e.g.
  - Matsumoto et al. (2005) use dependency sub-tree patterns for
    sentiment classification.
  - Content words as well as function words have been used as
    polarity reversers (Choi and Cardie, 2008).
- It doesn’t matter which one you use as long as it is possible to
  find the citation in the bibliography.
- Do not write the citations and the bibliography by hand; use
  automation. For LaTeX there is BibTeX; for Word/OpenOffice
  there are ways of referencing items in a text.
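
The two styles differ only in how the same author and year information
is rendered, which is why automation works well here. The sketch below
reproduces the shorthands used in these slides. Its rules are inferred
from the labels in this bibliography (three letters for a single author,
initials otherwise) and are an assumption, not the full definition of
any BibTeX style; for example, it does not add the a/b suffix that
distinguishes [JL06a] from [JL06b].

```python
def alpha_label(last_names, year):
    """Letter-and-year shorthand such as CC08 or Gam04."""
    if len(last_names) == 1:
        letters = last_names[0][:3]   # single author: first three letters
    else:
        letters = "".join(name[0] for name in last_names)  # initials
    return f"{letters}{year % 100:02d}"


def author_year(last_names, year, in_sentence=False):
    """Name-and-year citation such as (Choi and Cardie, 2008)."""
    if len(last_names) == 1:
        names = last_names[0]
    elif len(last_names) == 2:
        names = " and ".join(last_names)
    else:
        names = f"{last_names[0]} et al."
    return f"{names} ({year})" if in_sentence else f"({names}, {year})"


choi_cardie = ["Choi", "Cardie"]
matsumoto = ["Matsumoto", "Takamura", "Okumura"]

print(alpha_label(choi_cardie, 2008))                   # CC08
print(alpha_label(["Gamon"], 2004))                     # Gam04
print(author_year(matsumoto, 2005, in_sentence=True))   # Matsumoto et al. (2005)
print(author_year(choi_cardie, 2008))                   # (Choi and Cardie, 2008)
```

In practice the bibliography style does this work; the point of the
sketch is that the label is derived from the entry and never typed by
hand.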
Bibliography
[AWCM11] Cem Akkaya, Janyce Wiebe, Alexander Conrad, and Rada Mihalcea.
    Improving the impact of subjectivity word sense disambiguation on
    contextual opinion analysis.
    In Proceedings of CoNLL ’11, pages 87–96, 2011.
[AWM09] Cem Akkaya, Janyce Wiebe, and Rada Mihalcea.
    Subjectivity word sense disambiguation.
    In Proceedings of EMNLP ’09, pages 190–199, 2009.
[CC08] Yejin Choi and Claire Cardie.
    Learning with compositional semantics as structural inference for
    subsentential sentiment analysis.
    In Proceedings of EMNLP ’08, pages 793–801, 2008.
[DLP03] Kushal Dave, Steve Lawrence, and David M. Pennock.
    Mining the peanut gallery: opinion extraction and semantic
    classification of product reviews.
    In Proceedings of WWW ’03, pages 519–528, 2003.
[Gam04] Michael Gamon.
    Sentiment classification on customer feedback data: noisy data,
    large feature vectors, and the role of linguistic analysis.
    In Proceedings of COLING ’04, 2004.
[GL08] Murthy Ganapathibhotla and Bing Liu.
    Mining opinions in comparative sentences.
    In Proceedings of COLING ’08, pages 241–248, 2008.
[ITRO08] Daisuke Ikeda, Hiroya Takamura, Lev-Arie Ratinov, and Manabu
    Okumura.
    Learning to shift the polarity of words for sentiment classification.
    In Proceedings of IJCNLP ’08, pages 50–57, 2008.
[JL06a] Nitin Jindal and Bing Liu.
    Identifying comparative sentences in text documents.
    In Proceedings of SIGIR ’06, pages 244–251, 2006.
[JL06b] Nitin Jindal and Bing Liu.
    Mining comparative sentences and relations.
    In Proceedings of AAAI ’06, pages 1331–1336, 2006.
[JL07] Nitin Jindal and Bing Liu.
    Analyzing and detecting review spam.
    In Proceedings of ICDM ’07, pages 547–552, 2007.
[JL08] Nitin Jindal and Bing Liu.
    Opinion spam and analysis.
    In Proceedings of WSDM ’08, pages 219–230, 2008.
[MTO05] Shotaro Matsumoto, Hiroya Takamura, and Manabu Okumura.
    Sentiment classification using word sub-sequences and dependency
    sub-trees.
    In Proceedings of PAKDD ’05, pages 21–32, 2005.
[NLC09] Ramanathan Narayanan, Bing Liu, and Alok Choudhary.
    Sentiment analysis of conditional sentences.
    In Proceedings of EMNLP ’09, pages 180–189, 2009.
[OCCH11] Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey T. Hancock.
    Finding deceptive opinion spam by any stretch of the imagination.
    In Proceedings of HLT ’11, pages 309–319, 2011.
[RW03] Ellen Riloff and Janyce Wiebe.
    Learning extraction patterns for subjective expressions.
    In Proceedings of EMNLP ’03, pages 105–112, 2003.
[TM08a] Ivan Titov and Ryan McDonald.
    A joint model of text and aspect ratings for sentiment
    summarization.
    In Proceedings of ACL ’08, pages 308–316, 2008.
[TM08b] Ivan Titov and Ryan McDonald.
    Modeling online reviews with multi-grain topic models.
    In Proceedings of WWW ’08, pages 111–120, 2008.
[WBRK10] Michael Wiegand, Alexandra Balahur, Benjamin Roth, and
    Dietrich Klakow.
    A survey on the role of negation in sentiment analysis.
    In Proceedings of NeSp-NLP ’10, pages 60–68, 2010.
[WM06] Janyce Wiebe and Rada Mihalcea.
    Word sense and subjectivity.
    In Proceedings of ACL ’06, pages 1065–1072, 2006.
[WWH05] Theresa Wilson, Janyce Wiebe, and Paul Hoffmann.
    Recognizing contextual polarity in phrase-level sentiment analysis.
    In Proceedings of HLT ’05, pages 347–354, 2005.