Master’s Programme in Data Science

Notes on Data Science
University of Gothenburg
2015-04-13
Ä6 ITFS 2015-04-16
Master’s Programme in Data Science
Introduction
Data Science is concerned with extracting meaning from big data. Central topics
within Data Science include data mining, machine learning, databases, and the
application of data science methods in natural sciences, life sciences, humanities and
social sciences, as well as in industry and society.
Education in data science can take many forms. Learning outcomes include
knowledge and skills related to the computational techniques needed to process and
analyse large data sets. On the technical side, this can include detailed understanding
of relevant computational and statistical methods. It can also include recognising how
these methods are applied in different fields, and the challenges of working with large
data sets in different disciplines.
A multi-disciplinary programme
The wide applicability of Data Science methods, and the rise of data-intensive
research in different areas, mean that Data Science education is relevant across many
faculties. Thus, there is potential for a truly multi-disciplinary education programme
in Data Science. Even the narrower technical core of Data Science requires expertise
spanning computing science and statistics.
Target student groups
One target group is students with a background in a relevant technical area
(computing science, statistics, etc.) seeking a programme that includes advanced
technical courses in Data Science.
It may also be possible to recruit students with other backgrounds who want to
move into the Data Science field. These students would need introductory courses to
give basic knowledge in programming and statistics. Some students might prefer
courses that focus on the application of Data Science, rather than going deeply into
its technical aspects.
Courses
Hopefully it will be possible to make use of some existing courses from other
programmes. Similarly, courses developed primarily for the Data Science programme
could be of interest to students in other programmes.
Technical courses in Data Science
A new course “Introduction to Data Science” in reading period 1 would give an
overview of the subject, and set the agenda for the rest of the programme. This course
could contain guest lectures from teachers of later courses in the programme giving
introductions to the topics of those courses, and also guest lectures from industry.
Practical work in this course could include exercises using existing machine learning
and data mining tools, so that students get some hands-on experience in analysing data
sets. Detailed examination of the underlying methods used by these tools could be
deferred to later courses.
There are several existing courses that could fit into a Data Science Master’s
programme. Examples include “Statistical learning for big data” (Mathematical
Sciences), “Algorithms for machine learning and inference” (CSE), “Information
Visualization” (Applied IT) and “Algorithms” (CSE).
There are several graduate courses at Computer Science and Engineering that could
potentially be adapted for a Data Science Master’s programme. Examples include
“Decision making under uncertainty” given by Christos Dimitrakakis, a course on
“Learning, privacy and security” by Katerina Mitrokotsa and “Ethics and Philosophy
of Computing” by Moa Johansson.
There has not yet been a discussion among staff at Computer Science and
Engineering, Applied IT and Mathematical Sciences about other existing courses that
could be used, or new courses that could be developed. Possible topics for
consideration include an advanced course in databases. The ongoing recruitment of a
senior lecturer in Data Science at the University of Gothenburg should result in
opportunities for specialised courses in the successful applicant’s area.
Applications of Data Science
Courses from across the whole university could potentially fit into a
Master’s Programme in Data Science. Examples include courses in the existing
Master’s programme in Language Technology and the planned programmes in Digital
Humanities and Investigative Journalism. Some courses are mentioned under
“Relationship to other programmes” (below).
There are also other courses at Chalmers that fit well with Data Science. Examples
include “Bioinformatics” (Mathematical Sciences), “eHealth” (Signals and Systems),
and “Geographical Information Systems” (Architecture). There have not yet been any
discussions about whether these courses could be made available to GU students in a
Data Science programme, and whether their prerequisites could be fulfilled.
Courses for students from outside computing/statistics
An introductory programming course is needed, ideally to run in reading period 1.
This could be an existing programming course in Java (although these are mainly
taught in Swedish, so could not be taken directly by international Master’s students).
The Master’s Programme in Language Technology includes an “Introduction to
programming” in reading period 1, where the programming language Python is used;
this could be a good choice for Data Science.
An introductory statistics course is needed, ideally to run in reading period 2. It would
be good to develop a new course (e.g. “Statistics for Data Science”) that introduces
concepts from the perspective of big data, rather than the perspective usually taken in
introductory statistics courses.
There is an existing introductory course on “Databases” (DIT620) that runs in
reading period 2 and again in reading period 3. This course focuses on the design,
implementation and usage of (mainly) relational databases.
These courses might also be relevant for students who have a first degree in a
technical subject. For example, students with a background in mathematics might not
have taken courses in programming or databases in their previous studies.
For students with backgrounds outside computing/statistics, the first semester could
consist of “Introduction to Data Science” and “Introduction to programming for Data
Science” in reading period 1, followed by “Statistics for Data Science” and
“Databases” in reading period 2.
Relationship to other programmes
Computer Science
The existing Computer Science Master’s programme (120 hec) is a broad programme
that is closely connected to research at the Department of Computer Science and
Engineering. Some courses could be common to both the Computer Science and Data
Science programmes. The multi-disciplinary character of Data Science and its focus
on data give the Data Science programme a distinct profile.
Language Technology
The existing Language Technology Master’s programme (60 hec or 120 hec) has
many courses that are relevant to data science. The responsible department is
Philosophy, Linguistics and Theory of Science, in cooperation with Computer Science
and Engineering and Applied Information Technology.
Relevant courses for Data Science could include “Statistical Methods”, “Machine
learning for NLP”, “Information retrieval” and “Knowledge representation and
inference”.
Digital Humanities
A new programme in Digital Humanities (120 hec) is being proposed for autumn
2017, led by the Department of Literature, History of Ideas and Religion.
Relevant courses for Data Science could include “Methods for text analysis” and
“Methods for visualisation and image analysis”.
The set of four courses listed under “Courses for students from outside
computing/statistics” (above) could be an attractive set of elective courses for Digital
Humanities students in the second year of their programme.
Investigative Journalism
A new programme in Investigative Journalism (60 hec) is being proposed for autumn
2016, led by the Department of Journalism, Media and Communication (JMG).
Relevant courses for Data Science could include “Data journalism and visualisation”
(15 hec).
JMG would be interested in a dual degree in Journalism and Computer Science,
similar to that offered by Columbia University.