sami.bl.uk

Enhancing discovery of the British Library’s audio collections
Richard Ranft
23 June 2014
Making Metadata Work: ISKO UK + IRSG + DCMI joint meeting
Discovering the British Library’s audio collections
• the collections
• discovery and access
• improving discovery and access
www.bl.uk
The British Library’s audio collections
• established 1955 (as British Institute of Recorded Sound)
• national collection of UK record industry
• selected publications from overseas
• radio broadcasts
• unpublished recordings
Subjects
• music
• spoken word
• environments & nature
Extent
• 8 million tracks
• from 1857 to this morning
• many formats
• total 115 years of listening
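As a rough sanity check on these figures, the two totals can be combined into an implied average track length. This is only a back-of-envelope sketch: it assumes "115 years of listening" means continuous, round-the-clock playback, and the function name is illustrative.

```python
# Back-of-envelope check of the figures on this slide.
TRACKS = 8_000_000
LISTENING_YEARS = 115
# Assumption: a "year of listening" is continuous playback, 24 hours a day.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def mean_track_minutes(tracks: int, years: float) -> float:
    """Average track length implied by a total listening time."""
    return years * MINUTES_PER_YEAR / tracks


print(f"{mean_track_minutes(TRACKS, LISTENING_YEARS):.1f} minutes per track")
# prints "7.6 minutes per track"
```

Under that assumption the average item runs to roughly seven and a half minutes.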
Barriers to access
• copyrights
• many non-digital tracks
• offline digital
• time-based = time consuming
• limited, text-based search
• no serendipity
• high expectations (cf. iTunes, Spotify)
• ‘opacity’ of audio (no freeze-frames!)
Current access
• Sound & Moving Image Catalogue: sami.bl.uk
• onsite listening:
– Listening & Viewing Service
– SoundServer (200,000 tracks, 2.5% of collections)
• offsite listening:
– BL Sounds (50,000 tracks, 0.6%)
• streaming
• downloading
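The percentages quoted for SoundServer and BL Sounds follow from the 8 million track total given earlier in the deck. A minimal sketch of that arithmetic (the function name is illustrative, and the total is the rounded figure from the slides):

```python
TOTAL_TRACKS = 8_000_000  # rounded collection size quoted earlier in the deck


def share_percent(tracks: int, total: int = TOTAL_TRACKS) -> float:
    """Percentage of the whole collection represented by `tracks`."""
    return 100 * tracks / total


for service, tracks in [("SoundServer", 200_000), ("BL Sounds", 50_000)]:
    print(f"{service}: {share_percent(tracks):.3f}% of the collection")
```

The BL Sounds share works out at 0.625%, which the slide gives as 0.6%.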
Sound & Moving Image Catalogue
sami.bl.uk
Existing web services
Human-led enrichment
• description
• transcription
• annotation
• category tagging
• rating
• recommendation & review
Machine enrichment/search
• Categorisation
– Music genre, language/dialect detection, mood
• Synchronisation
– Score following
– Transcript following
• Identification
– Speaker/vocalist ID
– Melody recognition
– Query by humming/tapping
• Non-text browsing
– Map browse
– Timeline browse
• Recommendation & matching
– Melody matching
• Cross-media linking
– Speaker/tune matching
• Feature extraction
– Pitch, tempo, chord, time signature, rhythm
• Segmentation/event detection
– Music/speech segments
– Speaker/lead instrument change
– Laughter, applause, emotion detection
• Transcription
– Speech-to-text
– Score generation
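To make "feature extraction" concrete, here is a toy sketch of one such feature, pitch, estimated by autocorrelation. It is not the method used by the Library or by any project named in this deck. The test signal is a synthesised sine tone, all names and parameters are illustrative, and real systems would rely on dedicated audio-analysis tools working on actual recordings.

```python
import math


def synth_tone(freq_hz: float, sample_rate: int = 8000, seconds: float = 0.1) -> list[float]:
    """Generate a pure sine tone to stand in for a real recording."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_pitch(samples: list[float], sample_rate: int,
                   fmin: float = 80.0, fmax: float = 1000.0) -> float:
    """Estimate the dominant pitch in Hz with a plain autocorrelation search.

    The signal is compared with delayed copies of itself; the delay (lag)
    at which it matches itself best is taken as one period of the pitch.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


tone = synth_tone(200.0)
print(f"Estimated pitch: {estimate_pitch(tone, 8000):.1f} Hz")
```

A single number like this becomes searchable metadata: once pitch, tempo or similar features are stored per track, they can be filtered and compared without anyone listening in real time.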
BL Sounds
• Improving access and discovery
• http://sounds.bl.uk/
Visualisation and analysis
• Centre de Recherche en Ethnomusicologie (CREM): http://archives.crem-cnrs.fr/
• Powered by Telemeta: http://telemeta.org/
Current projects
• work with Metable and record labels to acquire and describe digital music
• search via APIs across open music databases such as MusicBrainz, Decibel, Discogs
• COMMA: cloud-based media analysis project with BBC: http://www.bbc.co.uk/rd/projects/comma
Example
http://sounds.bl.uk/Arts-literature-and-performance/Early-spoken-word-recordings/024M-1CS0011556XX-0200V0
English Conversation: At the Tobacconist's (1929)
Linguaphone 78rpm
Thanks for listening!
[email protected]