Trove’s Application Programming Interface
Electronic Resources Australia Annual Forum
Sydney, 10 July 2012
Debbie Campbell
Director, Collaborative Services
National Library of Australia
CC BY 3.0 http://creativecommons.org/licenses/by/3.0/au/

2 Surveying the possibilities
The API should provide an ability to download sets of records based on one of the following criteria, or a subset of them:
• in a particular format such as images, maps, articles, people and organisations, digitised newspaper articles
• with a particular tag
• from a particular contributor
• with a particular type of content, for example Tasmanian content
• created by a particular set of authors, such as written by researchers from Griffith University
• all commentary collected by Trove
with an ability to receive updates according to a schedule varying from daily to quarterly, and then integrate the records into discovery services ranging from commercial platforms to in-house solutions or open source solutions.

3 Positioning Trove to:
• make Australian content more discoverable
• allow Trove’s data to be used in ‘mash-ups’ and other services
• make enhanced metadata (“collective intelligence”) available to collaborators
• promote the analysis of Australian collections in memory institutions
• support academic research

4 Questions explored
• Are there records or content the National Library does not have permission to redistribute?
• The records in Trove are not all in the same format (data schema) – will this be a problem?
• Will sending out works instead of individual item records be OK?
• Where access conditions and copyright licences are made explicit, such as Creative Commons licences, how do we continue to emphasise this?
• Will lots of simultaneous downloading require additional support from the Trove platform?

5 The content
Are there records or content the National Library does not have permission to redistribute?
• MARC records from the National Bibliographic Database, especially those purchased specifically for the use of Libraries Australia members
• book cover art
• images of the Australian Women’s Weekly, and other newspaper content post 1955
• websites archived in PANDORA

6 The content
Records available:
• brief records from Libraries Australia: 17m+
• OAIster, Hathi Trust
• newspapers – titles, articles...
• people and organisation records, in their own API... wiki.nla.gov.au/display/ARDCPIP/Party+Infrastructure+APIs

7 The data schema
The records in Trove are not all in the same format (data schema) – will this be a problem?
Records are provided to Trove as:
• MARC/Resource Description & Access (RDA), from the National Bibliographic Database
• Dublin Core, from university and cultural heritage repositories
• EAC-CPF (Encoded Archival Context for Corporate Bodies, Persons and Families), in XML, for the people and organisations zone of Trove
Records are available for lists too.
In return, the API provides:
• Qualified Dublin Core
• brief and longer records (in JSON or XML)

8 The work
Will sending out works instead of individual item records be OK?

9 The access provisions
Where access conditions and copyright licences are made explicit, such as Creative Commons licences, how do we continue to emphasise them?
• Default position: to carry forward any conditions made explicit in the records, including the ‘best’ online status available
• Exceptions are negotiated upfront, because the NLA doesn’t monitor the copyright, and data exceptions are a load on the efficiency of the service itself

10 The downloading
Will lots of simultaneous downloading require additional support?
• a limited number of queries per minute/hour: 100/6K
• the limit was tested by half a dozen beta test sites
• the quantity of responses will depend on the query
[Chart: Trove Work Count by Zone, in millions, March 2010 to May 2012]

11 Trove Terms of Use

12 Who is using the Trove API
68 intrepid individuals, some representing research institutions
Recent sample declarations:
• integration with a research metadata repository...;
• saving results from Trove for thesis research and perhaps using them in a database for my research;
• testing for ANDS projects;
• ...I will be experimenting with using the api with our catalogue to improve our user experience. I am at the discovering "what is possible" stage, rather than having any fixed plans;
• Windows 8 App;
• Discoverability tool for non-English language content;
• Personal use for playing with data visualisation and mashups etc;
• Personal local history and genealogical research;
• To see how i can remix data and pull data from it to see if it has any uses for making my own apps or for the library I work for;
• Using it to develop a search [tool] for all Qld Govt Libraries;
• An internal transcription tool for the South Australian Genealogy and Heraldry Society Inc (a not for profit organisation)... interested in creating indexes of shipping entries.
‘.. The possibilities are endless...’

13 Who is using the Trove API
Five days to harvest four million newspaper articles, and analyse them like this:
Tim Sherratt, discontents, http://discontents.com.au/

14

15 Tim Sherratt: Mining for meanings

16 Tim Sherratt: Mining for meanings

17

18 References
Introduction to the Trove API: http://trove.nla.gov.au/general/api
Trove API Terms of Use: http://trove.nla.gov.au/general/api-termsofuse
Trove Contact Us, for further assistance: http://trove.nla.gov.au/contactus

19
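
Note on the query limit (slide 10): the 100/6K figure suggests that a harvesting client should pace its own requests. The following is a minimal client-side sketch, assuming the limit is counted over rolling one-minute and one-hour windows, which the slides do not state. The class QueryThrottle and its methods are hypothetical names introduced for illustration; they are not part of the Trove API, and no Trove request is made here.

```python
import time
from collections import deque


class QueryThrottle:
    """Client-side pacing for a per-minute and per-hour query limit.

    Hypothetical helper. The defaults mirror the 100/6K figure on the
    slide; the rolling-window interpretation is an assumption.
    """

    def __init__(self, per_minute=100, per_hour=6000, clock=time.monotonic):
        # (window length in seconds, queries allowed in that window)
        self.limits = ((60.0, per_minute), (3600.0, per_hour))
        self.clock = clock
        self.sent = deque()  # timestamps of recorded queries, oldest first

    def wait_time(self):
        """Return the seconds to wait before another query is allowed."""
        now = self.clock()
        # Forget queries older than the longest window (one hour).
        while self.sent and now - self.sent[0] >= 3600.0:
            self.sent.popleft()
        wait = 0.0
        for window, limit in self.limits:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= limit:
                # A slot opens once this query has left the window.
                blocking = recent[len(recent) - limit]
                wait = max(wait, blocking + window - now)
        return wait

    def record(self):
        """Note that a query is being sent now."""
        self.sent.append(self.clock())

    def acquire(self, sleep=time.sleep):
        """Block until a query is allowed, then record it."""
        wait = self.wait_time()
        if wait > 0:
            sleep(wait)
        self.record()
```

A harvester could call acquire() on a shared QueryThrottle before each request so that both windows are respected. The limits actually enforced by the service, and how they are counted, should be confirmed against the Trove API Terms of Use.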