Notes on Data Science University of Gothenburg 2015-04-13
Ä6 ITFS 2015-04-16

Master’s Programme in Data Science

Introduction
Data Science is concerned with extracting meaning from big data. Central topics within Data Science include data mining, machine learning, databases, and the application of data science methods in natural sciences, life sciences, humanities and social sciences, as well as in industry and society.

Education in data science can take many forms. Learning outcomes include knowledge and skills related to the computational techniques needed to process and analyse large data sets. On the technical side, this can include detailed understanding of relevant computational and statistical methods. It can also include recognising how these methods are applied in different fields, and the challenges of working with large data sets in different disciplines.

A multi-disciplinary programme
The wide applicability of Data Science methods, and the rise of data-intensive research in different areas, mean that Data Science education is relevant across many faculties. Thus, there is potential for a truly multi-disciplinary education programme in Data Science. Even the narrower technical core of Data Science requires expertise spanning computing science and statistics.

Target student groups
One target group is students with a background in a relevant technical area (computing science, statistics, etc.) seeking a programme that includes advanced technical courses in Data Science.

There is also a possibility to recruit students with different backgrounds who want to move into the Data Science field. These students would need introductory courses to give basic knowledge in programming and statistics. Some students might prefer courses that focus on the application of Data Science, rather than going so deeply into its technical aspects.

Courses
Hopefully it will be possible to make use of some existing courses from other programmes.
Similarly, courses developed primarily for the Data Science programme could be of interest to students in other programmes.

Technical courses in Data Science
A new course “Introduction to Data Science” in reading period 1 would give an overview of the subject, and set the agenda for the rest of the programme. This course could contain guest lectures from teachers of later courses in the programme giving introductions to the topics of those courses, and also guest lectures from industry. Practical work in this course could include exercises using existing machine learning and data mining tools, so that students get some hands-on experience in analysing data sets. Detailed examination of the underlying methods used by these tools could be deferred to later courses.

There are several existing courses that could fit into a Data Science Master’s programme. Examples include “Statistical learning for big data” (Mathematical Sciences), “Algorithms for machine learning and inference” (CSE), “Information Visualization” (Applied IT) and “Algorithms” (CSE).

There are several graduate courses at Computer Science and Engineering that could potentially be adapted for a Data Science Master’s programme. Examples include “Decision making under uncertainty” given by Christos Dimitrakakis, a course on “Learning, privacy and security” by Katerina Mitrokotsa and “Ethics and Philosophy of Computing” by Moa Johansson.

There has not yet been a discussion among staff at Computer Science and Engineering, Applied IT and Mathematical Sciences about other existing courses that could be used, or new courses that could be developed. Possible topics for consideration include an advanced course in databases.

The ongoing recruitment of a senior lecturer in Data Science at the University of Gothenburg should result in opportunities for specialised courses in the successful applicant’s area.

Applications of Data Science
Courses across the whole university could potentially fit into a Master’s Programme in Data Science. Examples include courses in the existing Master’s programme in Language Technology and the planned programmes in Digital Humanities and Investigative Journalism. Some courses are mentioned under “Relationship to other programmes” (below).

There are also other courses at Chalmers that fit well with Data Science. Examples include “Bioinformatics” (Mathematical Sciences), “eHealth” (Signals and Systems), and “Geographical Information Systems” (Architecture). There have not yet been any discussions about whether these courses could be made available to GU students in a Data Science programme, and whether their prerequisites could be fulfilled.

Courses for students from outside computing/statistics
An introductory programming course is needed, ideally to run in reading period 1. This could be an existing programming course in Java (although these are mainly taught in Swedish, so could not be taken directly by international Master’s students). The Master’s Programme in Language Technology includes an “Introduction to programming” in reading period 1, where the programming language Python is used; this could be a good choice for Data Science.

An introductory statistics course is needed, ideally to run in reading period 2. It would be good to develop a new course (e.g. “Statistics for Data Science”) that introduces concepts from the perspective of big data, rather than the perspective usually taken in introductory statistics courses.

There is an existing introductory course on “Databases” (DIT620) that runs in reading period 2 and again in reading period 3. This course focuses on the design, implementation and usage of (mainly) relational databases.

These courses might also be relevant for students who have a first degree in a technical subject.
For example, students with a background in mathematics might not have taken courses in programming or databases in their previous studies.

For students with backgrounds outside computing/statistics, the first semester could consist of “Introduction to Data Science” and “Introduction to programming for Data Science” in reading period 1, followed by “Statistics for Data Science” and “Databases” in reading period 2.

Relationship to other programmes

Computer Science
The existing Computer Science Master’s programme (120 hec) is a broad programme that is closely connected to research at the Department of Computer Science and Engineering. Some courses could be common to both the Computer Science and Data Science programmes. The multi-disciplinary character of Data Science and its focus on data give the Data Science programme a distinct profile.

Language Technology
The existing Language Technology Master’s programme (60 hec or 120 hec) has many courses that are relevant to data science. The responsible department is Philosophy, Linguistics and Theory of Science, in cooperation with Computer Science and Engineering and Applied Information Technology. Relevant courses for Data Science could include “Statistical Methods”, “Machine learning for NLP”, “Information retrieval” and “Knowledge representation and inference”.

Digital Humanities
A new programme in Digital Humanities (120 hec) is being proposed for autumn 2017, led by the Department of Literature, History of Ideas and Religion. Relevant courses for Data Science could include “Methods for text analysis” and “Methods for visualisation and image analysis”.

The set of four courses listed under “Courses for students from outside computing/statistics” (above) could be an attractive set of elective courses for Digital Humanities students in the second year of their programme.

Investigative Journalism
A new programme in Investigative Journalism (60 hec) is being proposed for autumn 2016, led by the Department of Journalism, Media and Communication (JMG). Relevant courses for Data Science could include “Data journalism and visualisation” (15 hec). JMG would be interested in a dual degree in Journalism and Computer Science, similar to that offered by Columbia University.