How to do Research in Computer Science
Peter Eades

Five bits of advice
1. Find a good topic
2. Use a good research method
3. Give lots of good talks
4. Write lots of good papers
5. Maintain good research ethics

Find a good topic
Two extreme topics

Independent Ira: Keywords in Armenian
I have always thought that programming languages which use keywords in Armenian lead to more productive software engineering. I want to prove it.

Team member Terri: Phylogenia of π-systems: the case k=4
My professor wrote the first π-system, and for the past 17 years has been studying the phylogenia of such systems. Three other people in my laboratory are studying k=1, k=2, and k=3; I will study k=4.

Find a good topic
Two extreme topics

Independent Ira: has an idea, and wants to pursue it, even alone.
• Dangerous topic
  - may lead nowhere
  - may be uncompetitive
• Can be satisfying for some people
• Funding unlikely

Team member Terri: adds a bit to a long-term team project.
• Safe topic
• Can be satisfying for some people
• More chance of funding

Find a good topic

Independent topic
Advantages
• More exciting for some people
Disadvantages
• Funding unlikely
• Hard to publish

Part of a team
Advantages
• Better support from colleagues
• Good chance of funding
Disadvantages
• Can be boring for some people

Find a good topic
Independent / Part of a team
Most IT research is somewhere in the middle; other sciences tend to be more team oriented.
My advice
Most people do better in a team. A few personalities are suited to independent topics.

Find a good topic
Two extreme topics

Irene the introvert: 2231-1 is a prime number
This problem has been bothering me for decades. I can’t rest until I know the answer.

Eddie the extravert: 2231-1 is a prime number
A guy in a software security company has been phoning me to ask about this “possibly prime” number, 2231-1. I’ll try to solve the problem.

Find a good topic
Two extreme topics
Irene the introvert: self-motivated, wants to find out for her own sake. There is no customer.
Eddie the extravert: has a customer who wants to know; he will try to find out. Customer oriented.

Find a good topic
What is a customer?
• A customer may be
  - An industrial partner
  - A social community
  - A separate community of academic researchers
• A customer wants to know the answer to your research problem
  - Because he/she is curious, or
  - Because he/she will make money from it, or
  - Because it will help his/her research, or
  - . . .

Find a good topic
What is a customer?
• A customer is someone outside your academic group
  - Not your supervisor
  - Not people that you meet at the annual conference for your research area
• Customers can provide
  - Feedback, plus
  - Inspiration and/or specification
  - Maybe some coffee, plus
  - Maybe some funding

Find a good topic

Introverted research
Advantages
• More exciting for some people
Disadvantages
• Funding unlikely
• May be worthless to everyone except yourself
• May be hard to get good feedback

Customer-oriented research
Advantages
• Good chance of good feedback
• Good chance of funding
• Better scientific criticism
• Better grounded in reality
• New problems arise
Disadvantages
• none

Find a good topic
Introspective / Customer-oriented
My advice
Always ensure that your research has a customer.

Find a good topic
Two more extreme topics

Narrow Nancy: The effect of the use of critical path planning in managing software projects
Narrow and deep: an investigation of a few variable parameters, with many parameters held fixed.

Broad Betty: How to manage software projects
Wide and shallow: considers many parameters at once.

Find a good topic
Narrow and broad topics

Narrow Nancy
Assume
• an OO design method
• Java
• small teams
• 10K – 100K SLOC
Investigate the effect of
• use of critical path planning

Broad Betty
Investigate the effects of
• 15 different design methodologies
• 7 different programming languages
• Small – huge SLOC
• 17 different planning methods

Find a good topic

Narrow and deep topic
Advantages
• More chance of pushing the boundary of knowledge
• More exciting
Disadvantages
• Your “model” may be too abstract and unrealistic
• It’s hard to choose the variable parameters and the fixed parameters

Wide and shallow topic
Advantages
• Realistic
• Good training for industrial research
Disadvantages
• Can be boring, like a collection of undergraduate projects
• Unlikely to contribute a lot to the state of knowledge

Find a good topic
Narrow / Wide
My advice
• Choose a narrow and deep topic
• Choose your variable and fixed parameters very carefully.

Find a good topic
Another two extreme topics

Fred the fundamentalist: Robustness theorems for non-pre-emptive scheduling methods
Fundamental topic:
• abstraction of specific hardware and software

Andy the applicationist: Disk cache scheduling for Gnu C++ memory management on a Pentium 4 processor running Solaris
Applied topic:
• specific hardware,
• specific software

Find a good topic

Fundamental topic
Advantages
• Your papers will have a longer life
• Your work can have more applications
Disadvantages
• It’s hard to push the boundaries very far
• Your “model” may be too abstract and unrealistic

Applied topic
Advantages
• Easier problems
• May help with getting a job in industry
• Can contribute a lot to a relevant area
Disadvantages
• Your papers can die young
• Restricted applications

Find a good topic
Another two extreme topics

Classical Kirsty: P=NP
I want to solve a problem that has defeated many others.

Popstar Paul: Wireless data mining for pervasive computing in social network censorship
I want a lot of newspaper coverage.

Find a good topic

Classical topic
Advantages
• You may solve a hard problem
• Your papers will have a longer life
• Better referees
• Higher scientific quality
Disadvantages
• Can be frustrating
• Immediate rewards can be small

Hot topic
Advantages
• Better immediate feedback
• With good timing, you can get rich
• Easier to publish
• Easier problems
• Vibrant community
Disadvantages
• Your papers can die young
• Scientific quality can be low

Find a good topic
Another two extreme styles
Classical hard problems / New hot topics
There are papers all over this range, but there is a tendency in Computer Science to be near the hot end.
My advice
Investigate a fundamental and classical topic, with some applications to a couple of hot and applied topics.

Find a good topic
Advice on topics: summary
I recommend:
• classical topics
• customer-oriented topics
• fundamental, deep, and narrow topics
and
• (perhaps shallow) applications to a few hot topics.
I recommend that you obtain breadth by being a member of a team.

Five bits of advice
1. Find a good topic
2. Use a good research method
3. Give lots of good talks
4. Write lots of good papers
5. Maintain good research ethics

Use a good research method
Researchers need to
1. Create models of these problems
   - abstract away the non-essential details
   - use scientific theories and formalisms
2. Solve the model problems
   - Use skills in CS/Math/commonsense/…
   - Form solutions
3. Evaluate the solution to the model problem
   - Use skills in Math/Experiments/UCST
4. Present the solution to other researchers and to the customer
5. Adjust the model according to the customer’s evaluation, and repeat.

Use a good research method
The research procedure
1. The customer has a problem.
2. The researcher produces an initial model of the problem.
3. Repeat
   a) The researcher solves the problem, according to the model.
   b) The researcher evaluates the solution of the model problem.
   c) The customer evaluates the solution to the real problem.
   d) The researcher adjusts the model.
   Until the customer is satisfied.

Use a good research method
To follow the research procedure, we need to know
1. How to create/adjust a model?
2. How to find a solution?
3. How to evaluate a solution?

Use a good research method
1. Creating/adjusting a model
A model is formed by forgetting some of the parameters of the real problem; models are simplifications of real problems.

Use a good research method
Models in IT research
• Industrial research
  - You cannot forget many parameters of the real problem
  - Models are complex and fuzzy, maybe not so useful
• Academic research (including PhD theses)
  - You can forget most of the parameters of the real problem
  - Models are usually formal, mathematical, and crisp.

Use a good research method
2. Solutions
• Solutions are artifacts that help the customer: programs, protocols, metaphors, architectures, algorithms, theorems, ...
• To create a solution, you need
  a) Knowledge, and
  b) Courage

Use a good research method
a) Knowledge
• Most IT researchers draw on a number of fundamental skills to create a solution consisting of a number of artifacts.
[Diagram: a problem is attacked with fundamental skills (formal logic, compilers, OO models, concurrency, algorithms, mathematics) to produce a solution made of artifacts (program, protocol, metaphor, architecture, algorithm, theorem).]

Use a good research method
a) Knowledge
Your current knowledge is probably not enough to create a solution. You need to increase your skill set.
• Remember your undergraduate work
  - In Physics, Mathematics
  - Also Electronics, Psychology
  - Also in Anthropology, Poetry
• Read books and research papers
• Attend seminars and conferences
• Ask experts to teach you

Use a good research method
b) Courage is very important
• You need to consider wild, weird and wonderful possibilities
• You need to disregard commonly accepted wisdom and break commonly accepted rules
In IT research, this is critical (even in industrial IT research).

Use a good research method
3. Evaluating a solution
To evaluate a solution, you need
a) An evaluation measure that tells you whether the solution is good or bad
b) An evaluation method to compute the measure

Use a good research method
a) Evaluation measures
• There are three basic measures for the quality of a solution: effectiveness, efficiency, and elegance.

Use a good research method
The three measures:
1. Effectiveness: is the solution logically correct?
2. Efficiency: does the solution use resources efficiently?
3. Elegance: is the solution beautiful, simple, and elegant?
All solutions can be measured in terms of these three parameters.

Use a good research method
b) Evaluation methods
There are three basic evaluation methods: mathematics, experiments, and UCST.

Use a good research method
The three methods:
1. Mathematics: theorems, proofs
2. Experiments
   - Run programs on test data
   - Collect data about software projects
   - Test systems with human subjects
   - Needs skills in statistics
3. UCST: Try to sell your solution
These are the only evaluation methods in information technology, together with many combinations of these approaches.

Example: the plotter problem

Example
The plotter problem
A pen plotter is a calligraphic device: it has a pen which moves over the paper to draw the picture.

Example
The plotter problem
A digital plotter has a pen which can be up or down. It accepts a sequence of penUp/penDown/moveTo instructions.

penUp; moveTo (20,80)
penDown; moveTo (80,80)
penUp; moveTo (20,20)
penDown; moveTo (80,20)
penUp; moveTo (20,80)
penDown; moveTo (20,20)
penUp; moveTo (80,80)
penDown; moveTo (80,20)
penUp; zero

Example
The order of the moveTo instructions has an effect on the pen-up time.
The plotter problem is to order the instructions to minimize pen-up time.

penUp; moveTo (20,20)
penDown; moveTo (20,80)
moveTo (80,80)
moveTo (80,20)
moveTo (20,20)
penUp; zero

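The effect of instruction order on pen-up distance can be checked with a short script. The sketch below replays an instruction list and adds up the Euclidean length of every move made with the pen up. The tuple encoding of the instructions, the function name `pen_up_distance`, and the assumption that `zero` sends the pen to the origin (0,0) are illustrative choices, not part of the original slides.

```python
import math


def pen_up_distance(instructions, zero=(0.0, 0.0)):
    """Sum the Euclidean length of all moves made while the pen is up.

    `instructions` is a list of (op, arg) tuples; this encoding is an
    assumption made for the sketch. The pen is assumed to start up, at `zero`.
    """
    position = zero
    pen_is_up = True
    total = 0.0
    for op, arg in instructions:
        if op == "penUp":
            pen_is_up = True
        elif op == "penDown":
            pen_is_up = False
        elif op in ("moveTo", "zero"):
            target = zero if op == "zero" else arg
            if pen_is_up:
                total += math.dist(position, target)
            position = target
        else:
            raise ValueError(f"unknown instruction: {op!r}")
    return total


# The square drawn one side at a time, lifting the pen between sides.
side_by_side = [
    ("penUp", None), ("moveTo", (20, 80)), ("penDown", None), ("moveTo", (80, 80)),
    ("penUp", None), ("moveTo", (20, 20)), ("penDown", None), ("moveTo", (80, 20)),
    ("penUp", None), ("moveTo", (20, 80)), ("penDown", None), ("moveTo", (20, 20)),
    ("penUp", None), ("moveTo", (80, 80)), ("penDown", None), ("moveTo", (80, 20)),
    ("penUp", None), ("zero", None),
]

# The same square drawn as one continuous stroke.
one_stroke = [
    ("penUp", None), ("moveTo", (20, 20)), ("penDown", None), ("moveTo", (20, 80)),
    ("moveTo", (80, 80)), ("moveTo", (80, 20)), ("moveTo", (20, 20)),
    ("penUp", None), ("zero", None),
]

if __name__ == "__main__":
    print("side by side:", round(pen_up_distance(side_by_side), 1))
    print("one stroke:  ", round(pen_up_distance(one_stroke), 1))
```

Running the script prints the pen-up distance for both orderings, so the saving from drawing the square in one stroke can be read off directly.
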
Example
Total plotter distance with the pen up, for the first instruction sequence:

$$P_{up} = \sum_{\text{pen-up linesegs}} (\text{length of lineseg}) = \sum_{\text{pen-up linesegs}} \sqrt{(x_{start} - x_{finish})^2 + (y_{start} - y_{finish})^2}$$

$$= \sqrt{20^2 + 80^2} + \sqrt{60^2 + 60^2} + \sqrt{60^2 + 60^2} + \sqrt{60^2 + 60^2} + \sqrt{80^2 + 20^2}$$

$$= 2\sqrt{6800} + 3\sqrt{7200} \approx 420$$

[Figure: the square with corners (20,80), (20,20), (80,80), (80,20)]
Total plotter distance with pen up = Pup ≈ 420cm
If speed = 5cm/sec, then total plotter time with pen up ≈ 84 seconds.
But . . .

Example
Difficulty #1: The model is wrong
The plotter time is not proportional to distance: the plotter accelerates to top speed, runs at top speed, then decelerates to stop.
A specific plotter has
• A top speed s.
• A time t0 and distance d0 to reach top speed from stationary.
• A time t1 and distance d1 to slow down to stop from top speed.
Thus if the pen travels distance d, then the time is
t0 + (d – d0 – d1)/s + t1,
as long as d > d0 + d1.
BUT . . .

Example
The model problem
We have:
• A set of “primitives”, where
• Each primitive has a start point and a finish point.
We want:
• An ordering for the primitives to minimize pen up time.

Example
BUT: The model is good enough
• The times t0 and t1, and the distances d0 and d1, are all quite small
• In practice t0 + (d – d0 – d1)/s + t1 ≈ d/s
We have a trade-off between effectiveness and elegance: by ignoring acceleration and deceleration we
• Lose a very small amount of effectiveness, in terms of the accuracy of the model
• Gain a lot in elegance, in terms of the simplicity of the model
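
To get a feel for how little the simplification costs, the sketch below compares a travel time that accounts for acceleration and deceleration with the plain distance-over-speed estimate. It assumes the time spent at top speed is the cruise distance divided by the top speed s. All parameter values are hypothetical and are not measurements from a real plotter.

```python
def travel_time_with_acceleration(d, s, t0, d0, t1, d1):
    """Travel time over distance d when the plotter accelerates, cruises, then brakes.

    Only valid when d > d0 + d1, i.e. the pen actually reaches top speed.
    """
    if d <= d0 + d1:
        raise ValueError("the pen never reaches top speed over this distance")
    return t0 + (d - d0 - d1) / s + t1


def travel_time_simple(d, s):
    """Simplified model: time is proportional to distance."""
    return d / s


# Hypothetical plotter parameters, chosen only for illustration.
TOP_SPEED = 5.0   # cm/sec
T0, D0 = 0.1, 0.25  # time (sec) and distance (cm) to reach top speed
T1, D1 = 0.1, 0.25  # time (sec) and distance (cm) to stop from top speed

if __name__ == "__main__":
    d = 84.85  # one diagonal pen-up move across the example square, in cm
    exact = travel_time_with_acceleration(d, TOP_SPEED, T0, D0, T1, D1)
    simple = travel_time_simple(d, TOP_SPEED)
    print(f"with acceleration: {exact:.2f} s, simplified: {simple:.2f} s")
```

With these made-up values the two estimates differ by well under one percent, which is the kind of gap that justifies the simpler model; with a slower-accelerating plotter the gap would be larger.
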
Example
One easy solution is the greedy solution:
1. Choose the first primitive so that its start point is the closest start point to PEN_ZERO.
2. Repeat for k=1 to NUM_PRIMS-1
   Choose the kth primitive so that its start point is the closest unused start point to the previous finish point.

Example
UCST Evaluation
The greedy solution can be “proven” effective by UCST:
“Since it chooses the best alternative at each stage, it gives minimum pen up time”.
This may be convincing for some customers, but not for good scientists.
The greedy solution can be proven elegant by UCST: it is easy to understand, easy to implement.
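
The greedy solution is indeed short to implement. The following is a minimal sketch, assuming each primitive is a (start, finish) pair of points, that PEN_ZERO is the origin, and that pen-up time is proportional to Euclidean distance. The function names are illustrative, and the quadratic scan over unused primitives is the simplest possible version.

```python
import math

PEN_ZERO = (0.0, 0.0)  # assumed home position of the pen


def dist(p, q):
    """Euclidean (L2) distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def greedy_order(primitives, pen_zero=PEN_ZERO):
    """Order primitives greedily: always take the closest unused start point."""
    unused = list(primitives)
    order = []
    pen = pen_zero
    while unused:
        nearest = min(unused, key=lambda prim: dist(pen, prim[0]))
        unused.remove(nearest)
        order.append(nearest)
        pen = nearest[1]  # the pen ends at the finish point of the primitive
    return order


def pen_up_distance(order, pen_zero=PEN_ZERO):
    """Total pen-up distance for an ordering, including the return to pen_zero."""
    total = 0.0
    pen = pen_zero
    for start, finish in order:
        total += dist(pen, start)
        pen = finish
    return total + dist(pen, pen_zero)


# The four sides of the example square, in the order of the first instruction list.
square = [
    ((20, 80), (80, 80)),
    ((20, 20), (80, 20)),
    ((20, 80), (20, 20)),
    ((80, 80), (80, 20)),
]

if __name__ == "__main__":
    print("original order:", round(pen_up_distance(square), 1))
    print("greedy order:  ", round(pen_up_distance(greedy_order(square)), 1))
```

Each step scans every unused primitive, so this version takes time quadratic in the number of primitives.
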

Example
Evaluation by mathematics
The effectiveness of the greedy solution can be investigated using Mathematics.
First, it does not always give optimal results.
[Figure: an example plot, with distances labelled 0.99 and 1.0, on which the greedy solution gives total penup distance ≈ 12.5]

Example
The optimal path is shorter.
Greedy: Total penup distance ≈ 12.5
Optimal: Total penup distance ≈ 7

$$\frac{\mathrm{GREEDY}}{\mathrm{OPT}} \cong \frac{12.5}{7} \cong 1.8$$
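
For very small inputs the gap between greedy and optimal can be checked by brute force. The sketch below tries every ordering of the primitives and compares the best one with the greedy ordering. The three-dot instance is a made-up example (each dot is a primitive whose start and finish coincide), not the plot shown on the slide, and the function names are illustrative.

```python
import itertools
import math

PEN_ZERO = (0.0, 0.0)  # assumed home position of the pen


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def pen_up_distance(order, pen_zero=PEN_ZERO):
    """Total pen-up distance for an ordering, including the return to pen_zero."""
    total, pen = 0.0, pen_zero
    for start, finish in order:
        total += dist(pen, start)
        pen = finish
    return total + dist(pen, pen_zero)


def greedy_order(primitives, pen_zero=PEN_ZERO):
    """Always take the unused primitive whose start point is closest to the pen."""
    unused, order, pen = list(primitives), [], pen_zero
    while unused:
        nearest = min(unused, key=lambda prim: dist(pen, prim[0]))
        unused.remove(nearest)
        order.append(nearest)
        pen = nearest[1]
    return order


def optimal_order(primitives, pen_zero=PEN_ZERO):
    """Exhaustive search over all orderings; only feasible for a handful of primitives."""
    return min(itertools.permutations(primitives),
               key=lambda order: pen_up_distance(order, pen_zero))


# Hypothetical instance: three dots on a line, at x = 1, x = -1.9 and x = 4.
dots = [((1, 0), (1, 0)), ((-1.9, 0), (-1.9, 0)), ((4, 0), (4, 0))]

if __name__ == "__main__":
    greedy = pen_up_distance(greedy_order(dots))
    best = pen_up_distance(optimal_order(dots))
    print(f"GREEDY = {greedy:.1f}, OPT = {best:.1f}, ratio = {greedy / best:.2f}")
```

Greedy is lured to the nearby dot at x = 1, then zig-zags across the origin, while the optimal tour sweeps the line once in each direction.
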

Example
BUT: The greedy method is close to optimal:
Theorem
If GREEDY is the penup time with the greedy solution and OPT is the penup time with the optimum solution then GREEDY / OPT = O(log n).
Proof
Lots of proof, lots of mathematical symbols, more and more pages of equations and stuff like that; it goes on for 30 pages with lemmas and corollaries.

Example
Experimental Evaluation
[Diagram: plotter instructions (random, or a customer-supplied benchmark) → greedy algorithm → effectiveness test]
Effectiveness test
• Measure Pup
• Compute LBOPT
• Calculate Pup/LBOPT

Example
Effectiveness test
• We want to compute Pup/OPT as a measure of effectiveness.
• However, it is difficult to compute OPT.
• Instead we compute LBOPT, a lower bound for OPT (i.e., LBOPT ≤ OPT)
• Then we have an upper bound on the effectiveness ratio, since Pup/LBOPT ≥ Pup/OPT

Example
Experiments showed that greedy is very close to optimal: for larger plots it is within 10% of optimal.
[Chart: %Pup/LBOPT on the vertical axis (100 to 120), with horizontal axis ticks from 1000 to 6000]
BUT . . .

Example
The research procedure
1. The customer has a problem.
2. The researcher produces an initial model of the problem.
3. Repeat
   a) The researcher solves the problem, according to the model.
   b) The researcher evaluates the solution of the model problem.
   c) The customer evaluates the solution to the real problem.
   d) The researcher adjusts the model.
   Until the customer is satisfied.

Example
Difficulty #2
We replaced the quality evaluation with a real plotter, and timed the real plotter using the wall clock.
[Diagram: plotter instructions → greedy algorithm → real plotter]
It revealed two problems:
• The model was wrong,
• The greedy algorithm was not efficient.

Example
Our model was wrong
At a micro-level, the plotter pen moved in three ways:
• Horizontally
• Vertically
• (some plotters) At 45° to horizontal
Each micro-movement takes one unit of time.
This implies that the distance function is L1 or L∞ rather than L2.

Example
Solution #2:
Mathematical Evaluation with the new model
• It was easy to check that the mathematical results remain true for any distance function, and this change in model did not change the theorems significantly.
Theorem
If GREEDY is the penup time with the greedy solution and OPT is the penup time with the optimum solution then GREEDY / OPT = O(log n).
• The higher level of abstraction implies that mathematical methods are robust.
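
The three distance functions mentioned in this example differ only in how they combine the horizontal and vertical offsets. The sketch below assumes that one micro-movement covers one unit horizontally, one unit vertically, or (on plotters with diagonal moves) one unit in both directions at once; the function names are illustrative.

```python
import math


def l2_distance(p, q):
    """Euclidean distance: the original (wrong) model of pen travel."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def l1_distance(p, q):
    """Manhattan distance: only horizontal and vertical micro-movements."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def linf_distance(p, q):
    """Chebyshev distance: diagonal micro-movements cover x and y together."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def micro_move_time(p, q, has_diagonal_moves):
    """Number of unit-time micro-movements needed to get from p to q."""
    return linf_distance(p, q) if has_diagonal_moves else l1_distance(p, q)


if __name__ == "__main__":
    p, q = (0, 0), (3, 4)
    print("L2:", l2_distance(p, q))      # straight-line model
    print("L1:", l1_distance(p, q))      # no diagonal moves
    print("Linf:", linf_distance(p, q))  # with 45-degree moves
```

Because the greedy procedure only ever asks for "the closest start point", swapping the distance function is a one-line change in an implementation.
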

Example
Difficulty #3
Our solution was not efficient in the customer context
[Diagram: plotter instructions → greedy algorithm → real plotter]
The greedy algorithm runs in time O(n²).
This was slower than the drawing procedure.

Example
Difficulty: Our solution was not efficient in the customer context
• Classical solution:
  - Use computational geometry and clever data structures, reduce the time complexity of the algorithm to O(n log n)
• Better solution: be more creative, break the rules

Example
Solution #3
Optimize one buffer-sized section at a time.
[Diagram: greedy algorithm → buffer → plotter mechanics]
An “optimized” bufferful is sent from the greedy algorithm to the buffer whenever the plotter has exhausted the current buffer.

Example
The bufferised greedy algorithm was almost as effective as the straight greedy algorithm, and much faster.
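
One way to read "optimize one buffer-sized section at a time" is sketched below: the primitives are taken in arrival order, cut into chunks of `buffer_size`, and each chunk is ordered greedily starting from wherever the pen finished the previous chunk. The chunking scheme, the names, and the Euclidean distance are assumptions made for illustration; the slides do not give the details of the real implementation.

```python
import math


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def greedy_order(primitives, pen):
    """Greedy ordering of (start, finish) primitives, starting from position `pen`."""
    unused, order = list(primitives), []
    while unused:
        nearest = min(unused, key=lambda prim: dist(pen, prim[0]))
        unused.remove(nearest)
        order.append(nearest)
        pen = nearest[1]
    return order


def buffered_greedy(primitives, buffer_size, pen_zero=(0.0, 0.0)):
    """Apply the greedy ordering to one buffer-sized chunk at a time."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    order = []
    pen = pen_zero
    for i in range(0, len(primitives), buffer_size):
        chunk = greedy_order(primitives[i:i + buffer_size], pen)
        order.extend(chunk)
        pen = chunk[-1][1]  # the next chunk starts where this one finished
    return order


# The four sides of the example square, in their original order.
square = [
    ((20, 80), (80, 80)),
    ((20, 20), (80, 20)),
    ((20, 80), (20, 20)),
    ((80, 80), (80, 20)),
]

if __name__ == "__main__":
    for prim in buffered_greedy(square, buffer_size=2):
        print(prim)
```

With n primitives and chunks of size b, this sketch does about n/b greedy runs of cost proportional to b² each, so roughly n·b work in total instead of n².
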

Example
Lessons to learn
1. The research procedure is loopy
2. You need to change your model
3. Laboratory experiments are different to experiments in context
4. Working with a customer motivates good science
5. Breaking the rules of your research area can give good results
6. Different evaluation techniques (maths, empirical, UCST) have different strengths
7. A full evaluation is a combination of the three techniques

Example
Lessons to learn

Mathematics
Strengths
• Robust to model changes
• Good evaluation of pathological behavior
Weaknesses
• Does not evaluate the model

Experiments
Strengths
• Evaluates the model
• Good evaluation of normal behavior
Weaknesses
• Poor evaluator for pathological behavior

UCST
Strengths
• OK to evaluate elegance
Weaknesses
• Poor evaluator of efficiency / effectiveness.

Use a good research method
My advice
To evaluate your solution
• Concentrate on mathematical and experimental methods, avoid UCST
• Relate your results to effectiveness, efficiency and elegance
• Try to evaluate with your customer

Five bits of advice
1. Find a good topic
2. Use a good research method
3. Give lots of good talks
4. Write lots of good papers
5. Maintain good research ethics

Give good talks
Giving a talk is beneficial to the speaker
• It helps you
  - define your problem
  - understand your own work
  - organize your ideas
  - become famous
  - write papers
• It brings feedback from others

Give good talks
You can present your research
• At IK-CCs (international killer-competitive conferences)
• At NLCs (nice local conferences)
• To research visitors to your lab
• As a poster / web page
• To your sister . . .
How often, reading down the list: once per year; 2 – 3 times per year; often; continuously.

Give good talks
How to give a talk at a conference
Giving a talk consists of three elements:
a) Organization
b) Talking and walking
c) Visuals
Some comments about research conference presentations follow.

Give good talks
a) Organization
Timeline of a 25-minute talk (minutes from the start):
0 – 5: Motivation
5 – 15: Overview of the research
15 – 20: Something difficult
20 – 23: Overview
23 – 25: Conclusion
Everyone understands the motivation and the overview of the research; only some understand the difficult part.

Give good talks
b) Talking and walking
• Look at the audience as much as possible
  - Choose specific people to focus on
• Speak slowly and clearly, and avoid idiomatic English
  - English is a second language to most people in IT
• Use your hands for expression
  - avoid holding a microphone
• Don’t waste time
  - Check your data-projector connection beforehand

Give good talks
Example:
Title: Fast spatial data mining in low dimensions
0 – 5: Data mining helps people
5 – 15: Your data mining algorithms
  - description at a high level
  - no proofs, no details
15 – 20: Math for the 2D case
20 – 23: Chart of experimental results
23 – 25: Repeat main results
Everyone understands the first 15 minutes; only some understand the math for the 2D case.

Give good talks
c) Visuals
• Use a medium that is suitable
  - Use a computer for graphics
  - Use a blackboard for mathematics
• Ensure that your visuals are perfect
  - No speeling errors
  - No spacing errors
  - Attractive layout (e.g., avoid linebreaks as much as possible)
• Don’t use visuals as notes to yourself
• Avoid words; use pictures wherever possible
• Avoid ducks

Give good talks
Look at the audience; avoid ducks

Give good talks
Look at the audience; use your hands

Give good talks
Look at the audience; avoid holding a microphone; ensure that your slides are perfect

Give good talks
Avoid words; use pictures

Give good talks
Look at the audience

Give good talks
Use the slides for the audience, not as reminders for you
[Example slide]
Formal specification of Security Protocols
• The need for security
• The need for formal specification
• Porter and Quirk’s language
• Inadequacies

Give good talks
More advice
• Give a practice talk to your team
• Ask people to look out for
  - errors and ducks in the visuals
  - idiomatic and ambiguous English
  - not looking at the audience
  and write it all down, and tell you
• Video the talk, look at the video

Five bits of advice
1. Find a good topic
2. Use a good research method
3. Give lots of good talks
4. Write lots of good papers
5. Maintain good research ethics

Write good papers
You can write
• One or two theses
• Papers
  - in NLCs
  - in IK-CCs
  - in journals
• Chapters in books
• Books

Write good papers
The paper writing process
[Flowchart: Draft a journal paper → Adjust for a conference (reduce the size, re-write the introduction) → Accepted? (Y/N) → Write the journal paper properly]

Write good papers
The paper writing process
[Flowchart: Draft a journal paper → Conference papers (extract a paper for conference A, extract a paper for conference B, extract a paper for conference C) → Write the journal paper properly]
Note: avoid recirculation

Write good papers
There are three basic kinds of conferences
• NLC
• IK-CC
• Scams
And many in between

Write good papers
How the conference paper process works
a) You write the paper
b) You submit the paper to the program committee chair
c) The program committee chair sends it to members of the program committee (takes about a week)
d) They read it (in about 4 weeks) and write a brief report. They decide whether to accept your paper
e) If your paper is accepted, you revise the paper according to the referees’ comments (2 – 4 weeks)
f) You give a talk at the conference

Write good papers
How does the program committee decide which papers to accept?
• In most cases, the papers are scored by members of the PC, then sorted on score.
• Very few papers get a very high score or very low score.
• Accept/reject decisions for middle-score papers can be fairly arbitrary
[Diagram: 10 - 20% obviously rejected; 60 - 80% random and ad-hoc decisions; 10 - 20% obviously accepted]

Write good papers
Four steps for conference papers
1. Write a good paper
2. Choose a good conference, and adjust your paper to that conference
3. Send the paper, before the deadline.
4. Sit around and hope that it is accepted

Write good papers
1. Write a good conference paper
Assuming that the page limit is 10 pages:
Pages 0 – 3: Motivation and background (everyone understands)
Pages 3 – 8: Main results (experts understand)
Pages 8 – 8.5: Conclusion (everyone understands)
Pages 8.5 – 10: References

Write good papers
1. Write a good conference paper
Some advice
• Keep it simple: only one main idea, two if you push it.
  - Most PC members are busy, some are lazy
  - They have at most 30 minutes to read your paper
  - Abstract should be short and extremely well written
• Make sure that the presentation is perfect
  - Grammar perfect
  - Figures beautiful
  - Exactly the right length
  - Font size 11 or 12
  - Nice layout

Write good papers
2. Choose a good conference, and adjust your paper to that conference
Choose a conference
  - The best possible (see the CORE ranking)
  - A good program committee
  - Realistic deadline
  - Avoid “scams”
Adjust your paper
  - Motivation aimed toward the conference community
  - Research methods that are familiar to the conference community
  - Look at web pages of the program committee, and write your paper for them to read

Write good papers
3. Send the paper before the deadline*
• Maybe 5 minutes before the deadline
• Maybe 1 minute before the deadline
* Some deadlines are soft, but most high-rejection-rate conferences have hard deadlines with no excuses

Write good papers
4. Sit around and hope that it is accepted
• Many many many good papers get rejected for reasons beyond your control
• Don’t worry if it is rejected

Write good papers
The journal paper process
1. Revision
2. Submission
3. Refereeing
4. Published

Write good papers
1. Revise (from conference paper(s))
Advice
• Give yourself a deadline
  - special issues are good
• Describe everything fully, in layers of detail
  - You can delete stuff later
  - Prove every theorem
  - Give full literature background
  - Give full details of experiments

Write good papers
2. Submission
You submit it to an editor of a journal
a) Choose a person who is an editor and who knows the field well
b) Choose the best journal for which your chosen person is on the editorial board
c) Send the paper to her/him (even if the journal’s web page instructions say to send it to the managing editor or someone else)
d) If she/he does not reply within 7 days, then send a reminder

Write good papers
3. Refereeing
a) The editor sends it to referees, with a three month deadline
b) The referees ignore it until the deadline
c) The editor sends a reminder, and suggests a new deadline
d) The referee reads it (takes many hours, perhaps a few days) and writes:
   • A report
   • A recommendation (accept | revise | reject)
e) The editor sends you the reports
Note: you should send a reminder every 6 months

Write good papers
1. Revise
• If “accept”, then you make minor revisions and proceed to publication
• If “revise” or “reject”, then you revise and re-submit it
  - Don’t get annoyed
  - But don’t take “no” for an answer
  - You can choose a different journal but you should assume that you will get the same referees
  - Address every point made by the referees; record how you addressed it

Write good papers
How to get your paper rejected
If you really want your paper rejected, here are the top methods
1. Write in bad English
2. Be unaware of current trends in the specific conference community
3. Organize your thoughts badly
4. Omit motivation

Five bits of advice
1. Find a good topic
2. Use a good research method
3. Give lots of good talks
4. Write lots of good papers
5. Maintain good research ethics

Ethics
There are three main sources of ethics
1. Philosophy and religion
2. The law
3. Professional guidelines
Aside:
• Discussion of research ethics without considering these sources is pointless.
• Amateur ethicists are dangerous

Ethics
1. Philosophy and religion
• In western countries, the dominant ethical philosophy is John Stuart Mill’s utilitarianism: an action X is better than an action Y if X leads to the greater good for humans.

Ethics
2. The law
• The main laws relevant to computer scientists are Intellectual Property laws. These cover
  - Patents
  - Copyright
  - Trade secrets
  - Trademarks
• These laws are very important for industrial research.

Ethics
3. Professional guidelines
Written guidelines on ethics are available from
• Every University
• Every Government research laboratory
• ACM code of practice
• ACS code of practice
These cover a wide range of ethical problems that might occur in industry and research.

Ethics
For academic research in IT there are three top ethical issues:
a) Authorship: who should be the author of a paper?
b) Recirculation, or self-plagiarism: writing the same paper twice.
c) Omission: failing to say something relevant.
All three are discussed in University policies, procedures, and guidelines.

Ethics
The main issues
Mainly from official policies
• AVCC Guidelines on Research Practice
• University of Sydney Guidelines
• NICTA guidelines
• CSIRO guidelines
1. General principles
2. Data storage
3. Authorship
4. Plagiarism/Recirculation
5. Omission
6. Supervision
7. Conflict of interest
8. What is scientific misconduct?
9. Processes for dealing with misconduct
10. What happens in practice?
11. Some scenarios
A summary of the guidelines follows.

General
1. General principles
Researchers must maintain:
• High standards
• Discipline-specific ethics
• Workplace safety
• Confidentiality (e.g., in questionnaires from humans)
Research results should be open to scrutiny by peer review. (Secrecy is possible, but only for a limited time)

Data storage
2. Data storage
• Data used in experiments must be stored
• Storage must preserve privacy as defined by an Australian Standard
• Minimum 5 years
• Should be stored in the institution as well as with the researcher
• If you publish a paper based on some data, then you should make the data available to other researchers on demand

Ethics
3. Authorship
Authorship is substantial participation, including:
• Conception and design, or analysis of data; and
• Drafting/revising the paper; and
• Final approval of the version to be published.
An author's role must be sufficient for that person to take public responsibility for the paper.

Ethics
Authorship
Authorship is not
• Helping to get funding for the project
• The collection of data
• General supervision of the research group
“Honorary authorship” is not acceptable.

Authorship
Co-authors
• Early in the project, you should discuss who will be an author
• One of the co-authors should be assigned to keep records (of experiments etc), and formally accept responsibility for the entire paper.
• Authors should sign an authorship statement
• This statement should be kept on file in the institute
• All authors should agree to being an author

Authorship
All people qualified to be authors should be authors
• No person who is allowed to be an author can be excluded as an author.
Non-authors who have contributed (e.g., funding) should be acknowledged

Ethics
Publications
More guidelines about publications
• Private publication (non-reviewed) is OK, but you
should explicitly say that it has not been reviewed
• You must acknowledge the sources of financial
support (as a declaration of possible conflict of
interest)
• Publishing lies is not allowed
4. Plagiarism/Recirculation
Recirculation, or self-plagiarism
ƒ Publication of multiple papers based on the same
data is not acceptable except where there is full
cross-referencing within the papers
ƒ Before you submit two similar papers, you should
tell both editors/publishers
ƒ Always cite previous papers that you have written
with a similar theme/content
ƒ As a rule of thumb, don’t copy-and-paste anything
except some parts of the introduction
5. Omission
ƒ You should not omit to say something
significant
¾ Obvious example: suppose that your
algorithm is derived from an algorithm by
person X. Then you should say so in
your paper.
¾ There are many more subtle examples.
6. Supervision
Your supervisors should
• Be well qualified
• Have a reasonable staff/student ratio
• Give you ethics guidance
• Ensure (as far as possible) the validity of the
data
7. Conflict of interest
Refereeing
You cannot referee a paper if you have a conflict of
interest with one of the authors.
A conflict of interest is defined for ACM/IEEE conferences
as any situation where you don't feel that you can
make an objective assessment, including:
ƒ you are a co-author
ƒ one of your current or former students is a co-author
ƒ your supervisor / former supervisor is a co-author
ƒ a colleague from your current institution is a co-author
ƒ a colleague who you have worked with on a
research project in the past 5 years is a co-author
Money
If you will gain financially from some research,
then you should say so in the publications
etc.
8. What is scientific misconduct?
¾ According to the guidelines →
Misconduct
Scientific misconduct = “fabrication, falsification, plagiarism, or
other unacceptable practices”.
For example:
ƒ misleading ascription of authorship
ƒ listing of authors without their permission
ƒ attributing work to others who have not in fact contributed
to the research
ƒ lack of appropriate acknowledgment of work primarily
produced by someone else
It does not include honest errors or honest differences in
interpretation.
Misconduct
Sin #1: Misappropriation
You should not:
ƒ Plagiarize (present the words or ideas of another as your
own, without reference)
ƒ Use information in breach of confidentiality associated
with the review of a manuscript or grant application
ƒ Omit reference to the relevant published work of others for
the purpose of implying personal discovery
Sin #2: Interference
You should not:
ƒ damage any research-related property of another
Sin #3: Misrepresentation
You should not:
ƒ Tell lies
ƒ Omit to say something significant
9. What happens if someone breaks the
guidelines / code of ethical behaviour?
Processes for dealing with research misconduct
University process
1. Someone makes an allegation
2. The Deputy Vice Chancellor for Research is notified.
3. It is investigated locally, to see whether it is serious.
4. If it is serious, then the matter is referred up the chain,
to a series of committees
How does it work in practice? →
10. What happens in practice?
In practice
You want:
ƒ To advertise your research as widely as possible
ƒ To write as many papers as possible
ƒ To become rich and/or famous
These things may tempt you to ignore research ethics,
BUT you should resist the temptation
In practice
Authorship and acknowledgement
ƒ Always ensure that every author on the paper agrees
to be an author
ƒ Always be generous in inviting authorship
ƒ Don’t be insulted if someone declines to be an author
ƒ Always acknowledge generously
¾ Funding sources
¾ People with whom you have had significant
discussions
¾ Generously reference as many relevant papers as
possible
In practice
Multiple submission / recirculation / self-plagiarism
ƒ It is a difficult issue
¾Multiple submission is bad
¾But two papers can be close to each other
ƒ Rule of thumb: “Introduction” may be created by
copy-paste, but no other part of the paper
should be created by copy-paste
ƒ Ask your boss/supervisor if in doubt
ƒ Multiple submission makes people mistrust you;
trust between researchers is very important
In practice
Plagiarism
ƒ Always acknowledge and reference generously
ƒ Never quote without quote marks
ƒ Never cut-and-paste from someone else’s work
ƒ If you use someone’s figures/pictures, ask their
permission first.
¾If they do not give permission, then try to use
alternative pictures
¾If they do give permission, then acknowledge
them fully in the figure caption
In practice
Omission
ƒ Omission is unfortunately common, even among
senior scientists
ƒ Very difficult to prevent
ƒ Relies mostly on people's sense of scientific
honesty
ƒ If you keep rigorously honest, for example:
¾do not use “straw man” comparison;
¾report experiments fully, even when the
results did not go as you expected,
then people will trust you more
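As an illustration of "report experiments fully" for a randomised algorithm, here is a minimal sketch in Python. It assumes you have a list of measured runtimes from repeated runs and one published baseline figure. The function name `summarise_runtimes` and the sample numbers are hypothetical, invented for this example only. The idea is to report the whole distribution (mean, median, spread, best and worst run), not just the single best run.

```python
import statistics


def summarise_runtimes(runtimes_ms, baseline_ms):
    """Summarise repeated runs of a randomised algorithm against a baseline.

    runtimes_ms: measured runtimes in milliseconds, one per run.
    baseline_ms: a single reference runtime to compare against.
    """
    ordered = sorted(runtimes_ms)
    runs = len(ordered)
    return {
        "runs": runs,
        "mean_ms": statistics.mean(ordered),
        "median_ms": statistics.median(ordered),
        "min_ms": ordered[0],
        "max_ms": ordered[-1],
        # Sample standard deviation needs at least two runs.
        "stdev_ms": statistics.stdev(ordered) if runs > 1 else 0.0,
        # Share of runs that beat the baseline.
        "fraction_faster_than_baseline": sum(r < baseline_ms for r in ordered) / runs,
    }


# Hypothetical measurements, for illustration only.
sample = [62.0, 150.0, 231.0, 310.0, 1450.0]
report = summarise_runtimes(sample, baseline_ms=81.0)
for key, value in report.items():
    print(f"{key}: {value}")
```

A table built from such a summary lets the reader see how often the algorithm beats the baseline, and how badly it can lose.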
In practice
University processes in practice
• The sinner is warned and the warning is
stored in a file.
• Bad offences, and repeat offences, result in:
¾ Staff being fired
¾ Students being thrown out
¾ Supervisors being fired / forced to resign
(sometimes because their students were
involved in research misconduct)
¾ Legal action
3. Some scenarios
Plus discussion
Scenario: Millicent and Mutter, with Dingle
1. An honours student, Millicent, writes a brilliant thesis on
simplifying agent-oriented concept design (AOCD).
2. His supervisor, Professor Mutter, sees that it is brilliant
and turns it into a joint paper, which is accepted to the
rank A conference AOS2003.
3. Millicent gets first class honours and goes to work in
Sweden.
4. Professor Mutter presents the joint paper at AOS2003 in
Tokyo.
5. Professor Dingle from Ohio State University sends an
email to Professor Mutter pointing out that the brilliant
simplification of AOCD was all in a paper that Dingle
published in 1998. She accuses Mutter of plagiarism.
Scenario: Millicent and Mutter, with Dingle (cont.)
6. Mutter checks Millicent’s thesis against Dingle’s paper
and finds that large sections have been copied, word for
word; Mutter apologizes to Dingle.
7. Mutter writes to his dean and asks that Millicent’s
honours degree be rescinded.
8. The dean accuses Mutter of plagiarism.
9. The case works its way up the University disciplinary
system.
10. The university offers Mutter a choice: accept a demotion
to Associate Professor, or resign.
11. Mutter resigns.
Scenario: Robbie and the Rapid Router
1. Robbie has a new routing algorithm RR that he thinks is
faster than previous algorithms.
2. There are some published benchmark data sets for this
kind of routing.
3. RR is a randomised algorithm (e.g., genetic algorithm) that
gives a different result every time you run it.
4. Robbie runs RR on the benchmark 1000 times, and finds
that the average runtime is 231.1 ms, while the maximum
runtime is 1451.7 ms and the minimum is 62.1 ms.
5. The best previous result on this benchmark used 81.3 ms.
6. Robbie submits a paper reporting that his new algorithm
is better than previous algorithms because it ran on the
benchmark in 62.1 ms.
7. … ?
Scenario: Bertie, Bogie and his wife
1. Bertie, the departmental director of research, does not
like Associate Professor Bogie.
2. Bertie notes that Bogie has written a joint paper with his
wife, who is a student at a different University.
3. Bertie begins to look through Bogie’s many papers and
finds three papers which are almost the same. They are
published at three different conferences.
4. In the meantime, Bogie accepts a job as Professor at a
different University.
5. Bertie writes to the director of research at Bogie’s new
University, pointing out that at least three papers of Bogie
are virtually the same.
6. … ?
Scenario: Ellen and the middle-aged Miles
1. A PhD student, Ellen, goes to a conference, and gives a talk.
2. After the talk, a middle-aged respectable professor (called
Miles) asks Ellen lots of questions, and asks her about her
future directions.
3. Ellen tells him everything; she is very happy that Professor
Miles is interested in her work.
4. Miles is ambitious, but he has only published two papers in
the last three years.
5. Six months later, Miles publishes a paper which has all the
stuff that he and Ellen discussed.
6. The paper has no acknowledgement to Ellen.
7. … ?
Scenario: Kathleen and Mabel, with Malmsbury error detection
1. A PhD student, Kathleen, is writing a paper for a
conference (in Colorado) and discusses it a lot.
2. Mabel has a great idea that would fit right in Kathleen’s
paper.
3. Kathleen and Mabel chat and agree to include Mabel’s
idea; also to include Mabel as an author.
4. The paper is accepted, and presented at the conference
by Mabel (Kathleen is in Norway at another conference).
5. At the conference, Professor Malmsbury sees a critical
error in the paper.
6. Mabel says “It’s really not my paper, Kathleen wrote it, it’s
her error”.
7. … ?
Scenario: Banbury, Brightwhistle and the X-rays
1. A PhD student, Banbury, invents a wobbly algorithm and
applies it to 1996 chest X-ray data from Wentworth.
2. He publishes the paper in WOBBLY2004, claiming that it
is better than the 2003 wobbly algorithm of Brightwhistle
and Scott.
3. Brightwhistle gets annoyed, because she thinks her
algorithm is the best.
4. Brightwhistle wants to test her 2003 algorithm on the
1996 chest X-ray data from Wentworth, and asks
Banbury for the data.
5. Banbury replies that he spent half his grant extracting the
data from the database, and if Brightwhistle wants the
data, then she can get it herself.
6. … ?
Scenario: Formby and the X-rays
1. A PhD student, Formby, invents a googly algorithm and
applies it to 1996 chest X-ray data from Wentworth.
2. The algorithm runs well on this data.
3. He publishes the paper in GOOGLY2004
4. Section 5 of the paper is “Evaluation”, based on the 1996
Wentworth chest X-ray data.
5. A year later, Formby is depressed because he hasn’t
discovered any new googly algorithms for a while.
6. He applies his original googly algorithm to 1997 chest
X-ray data from Billingworth.
7. He submits a paper in GOOGLY2005, same as the 2004
paper, except that Section 5 uses the 1997 Billingworth
chest X-ray data.
8. … ?