Workshop on:
“Grand challenges in data integration for research and innovation
policy: handling big data, coping with quality issues and anticipating
new policy needs. State of the art and future perspectives”
ISSI 2015 CONFERENCE, Istanbul
29 June 2015
Room: Kriton Curi Hall, time: 14:00-17:00
Organized by Cinzia Daraio, E-mail: [email protected], University of Rome La Sapienza
Summary: The fast-growing availability of open and linked data; the rapid evolution of big data;
the wider perspective opened by the altmetrics movement; the multidimensionality and complexity
of research assessment; and the need to move beyond mono-dimensional and biased rankings,
together with increasingly demanding policy needs, all call for new ways of data integration
and interoperability among many heterogeneous data sources. This workshop aims to discuss
critically, with some of the best-known experts in the field, the main problems related to the
integration of heterogeneous data sources: data quality issues; comparability problems;
standardization, interoperability and modularization; the creation of concordance tables among
different classification schemes; and the extensibility and updating of an integrated database
built from existing independent and heterogeneous databases.
Introduction and background
The fast-growing availability of open and linked data; the rapid evolution of big data into a big data
science; the wider perspective opened by the altmetrics movement with respect to traditional
bibliometrics; the multidimensionality and complexity of research assessment; and the need to
move beyond mono-dimensional and biased rankings, together with new trends in the
granularity and cross-referencing of science and technology (S&T) indicators and increasingly
demanding policy needs, all call for new ways of data integration and interoperability among
many heterogeneous data sources.
There have been recent efforts from policy makers to support the creation of new datasets in
Education, Science, Technology and Innovation.
In the US, the STAR METRICS and the Science of Science and Innovation Policy initiatives are clear
examples of these efforts.
In the European context there have been several initiatives. Aquameth, the pioneering
project on the microdata of European higher education institutions, led to the Eumida
(European Universities Microdata) feasibility study, which has been consolidated in the European
Tertiary Education Register (ETER). The mapping of the diversity of European institutions through the
U-Map project led to an institution-based effort to build a multidimensional ranking of
universities (U-Multirank). In addition, two large surveys on the European Research Area (ERA) were
launched in 2013 and in 2014 to gather comprehensive information on the activities carried out
by research funding and research performing institutions.
At the same time, parallel initiatives have aimed at standardizing the
elementary pieces of information. They are supported by international scientific associations (see, for
example, CODATA and the VIVO network of scientists) as well as by non-profit and community-driven
organizations, such as ORCID, which provides a registry of unique
researcher identifiers; CERIF, which aims at standardizing the operations of funding agencies; CASRAI,
which aims at the standardization of data on research institutions and funders; ISNI, which
provides lists and metadata on higher education, research, funding and other types of
organizations; and Ringgold, which refers mainly to publishers' activity.
Critical issues
None of the existing initiatives, however, solves the main problems related to the integration of
heterogeneous data sources, such as data quality issues; comparability problems;
standardization, interoperability and modularization; the creation of concordance tables among
different classification schemes; and the extensibility and updating of an integrated database
built from existing independent and heterogeneous databases.
Main objective of the Workshop
The main objective of the workshop is to take stock of where we are and where we are heading
on these critical issues, together with several of the best-known experts in the field and with the
workshop attendees, all of whom will contribute their background, experience and the projects
they have carried out on these issues.
Experts invited to the panel discussion within the workshop
Isidro Aguillo, Cybermetrics Lab, CSIC, Madrid
Andrea Bonaccorsi, University of Pisa
Wolfgang Glänzel, KU Leuven
Stefanie Haustein, EBSI, Université de Montréal
Stefan Hornbostel, iFQ and Humboldt University, Berlin
Sybille Hinze, iFQ, Berlin
Marc Luwel, Hercules Foundation and CWTS, Leiden
Henk F. Moed, Sapienza University of Rome
Some “hot” questions on “data integration for research and innovation policy: handling big data,
coping with quality issues and anticipating new policy needs” for the panelists and the
workshop attendees
Data-collection initiatives in Europe, US and all over the world
1. In Europe, ETER and U-Multirank will complete their activities in 2015. The ERA surveys
ran until 2014. What will come next? And what about the US and the rest of the world? What is the
future of the existing initiatives on the issues recalled in the introduction?
Options and costs
2. What are the options that the academic community envisages?
3. What are the estimated costs of the alternative options? What is the cost of non-action?
Open data, linked data and platforms for Science, Technology and Innovation: can they succeed?
4. In this context, can open data, open linked data and open platforms succeed? What are
the main obstacles to their implementation?
Monitoring evaluation systems
5. How can we track and monitor the consequences of the evaluation of research activities on the
behaviour of the evaluated scholars? How can opportunistic behaviours be detected and addressed? How can we
monitor the impact of changes in the indicators used in evaluation on the overall
system?
Stakeholders, actions and sustainability
6. What are the stakeholders' expectations on these subjects?
7. What are the actions that need to be taken by stakeholders, by policy makers and by the
scientific community on these subjects?
8. What is a sustainable model to propose to policy makers? What should the strategic
plan be for the long run? And what is advisable in the short run?
Detailed program of the workshop, Monday 29 June 2015
Room: Kriton Curi Hall, time: 14:00-17:00
14:00-14:15 Introduction, Cinzia Daraio, Sapienza University of Rome
14:15-15:25 Presentations by:
-) The Challenge of Data Collection, Harmonisation and Standardisation for Use in Research
Funding, Output Measurement and Research Assessment, Wolfgang Glänzel, KU Leuven
-) Heterogeneity of data in research assessment, Marc Luwel, Hercules Foundation and CWTS,
Leiden
-) The economics of S&T indicators, Andrea Bonaccorsi, University of Pisa and ANVUR
-) Multidimensionality of research assessment and issues of data integration, Henk F. Moed,
Sapienza University of Rome
15:25-15:45 Statements and discussions by:
-) Stefanie Haustein, EBSI, Université de Montréal, Montréal
-) Stefan Hornbostel, iFQ and Humboldt University, Berlin
15:45-16:00 Hints from the workshop “Google Scholar and related products”
-) Isidro Aguillo, CSIC, Madrid
16:00-16:15 Open discussion and questions
16:15-16:35 Feedback and comments by panelists
16:35-17:00 Summary, conclusions and next appointments (Cinzia Daraio, Sapienza University of
Rome; Wolfgang Glänzel, KU Leuven; Sybille Hinze, iFQ, Berlin).