How the World Bank built an enterprise
taxonomy -- a story with a happy ending
Denise A. D. Bedford, Ph.D.
Senior Information Officer
World Bank
ASIST Potomac Valley Chapter presentation
November 19, 2003
Storytelling
• I’m going to use a traditional Knowledge Management tool tonight to
tell you how we built our enterprise taxonomy – storytelling
• My goal in using this approach is to illustrate the technical,
information architecture, and social aspects of such an
undertaking
• It will also allow me to speak to some of the critical foundation
elements and milestones in the process
• It would not be truthful for me to tell you a story about how one day
we defined our enterprise taxonomy, and the next day we all lived
happily ever after!
• I’d like to take you back to the world of medieval fiefdoms – many
systems, many rules, different sets of laws, different languages and
grammars
Once upon a time
• We had many different financial systems, multiple document management
systems, 100s of searchable resources, and a number of gaps in coverage
of our information assets
• Then a wise and foreseeing Chief Information Officer and President helped
us to establish a stable, standard institutional platform for our institutional
collections (…our modern day Alexander the Great)
• This meant that instead of having multiple financial systems, human
resource systems, and document management systems, we had one to suit
each function (…first thoughts of unification arise…)
• And, the wise counselors advised them to select systems that functioned on
a common platform, Oracle (…we agree to talk to establish lines
of communication and send ambassadors)
• The enterprise begins to think of systems at an ‘enterprise’ level – this is a
crucial organizational culture aspect of implementing an enterprise
taxonomy
Consolidation of Business System Fiefdoms
• Before the dawn of the Knowledge Age, we had many different business systems
• Each business system had its own (or no…) metadata, classification schemes, indexes, search
systems…
• When we standardized our primary business systems, we merged those different taxonomies
into enterprise taxonomies
• In this first step, we still had multiple business systems, but one per business function
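As a rough illustration of merging system-specific taxonomies into enterprise ones, the sketch below maps local terms to enterprise terms through a crosswalk table. The system names, the terms, and the helpers `to_enterprise` and `unmapped_terms` are hypothetical and chosen only for illustration; they are not the Bank's actual mappings.

```python
# Hypothetical crosswalk: (system, local term) -> enterprise taxonomy term.
CROSSWALK = {
    ("finance", "Ed. Sector"): "Education",
    ("documents", "Schooling"): "Education",
    ("documents", "H2O Supply"): "Water Supply",
}


def to_enterprise(system, local_term):
    """Return the enterprise term for a local term, or None if unmapped."""
    return CROSSWALK.get((system, local_term))


def unmapped_terms(system, local_terms):
    """List local terms that still need a mapping decision."""
    return [t for t in local_terms if to_enterprise(system, t) is None]
```

Terms returned by `unmapped_terms` are the ones a taxonomy owner would need to review before a system can be considered merged.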
Laying Out the Information Empire
• Once we had established a common communication foundation, the people
in those different fiefdoms began to talk to one another and a cultural
change began to occur
• The idea of having ‘one’ business system to support a business function
was accepted by the masses
• Now we find we have many different kinds of taxonomies – accounting
structures, business functions/process/task taxonomies, product
taxonomies, taxonomies of job classes, skills taxonomies, organizational
taxonomies, personnel profiles, etc.
• We built taxonomies in these business function systems as we were
implementing them – designed to suit business functions and the people
who were administering the systems, not necessarily end users
• We start to understand the importance of usability and end-user training
From Business to Information Systems
• Then a wise counselor (information architect) had a vision of a
common enterprise-document management system
• When we began looking for such a system, though, the commercial
products were not up to snuff in terms of our requirements
• We developed our own in-house system – portions of which
were/were not using the common foundation
• The wise counselor had another vision of an integrated enterprise
information system that would support a single point of access to all
the information within the information empire
• This was the spark that set the goal for an integrated enterprise
architecture and taxonomy, though we were not sure we could
actually achieve it
Document Management Systems
• Document management system was like a cathedral that held the church network together –
smaller churches represented the units contributing to the system
• Document management system architecture was a little bit different, though
• Took many years to convince the little churches to send their offerings to the cathedral so they
could become part of the larger network
• Each church could maintain its own filing structures, which served the creators, not the users
• Eventually they agreed to use a common prayer book – common filing structure
• Churches can speak different languages but they all have to be able to communicate
[Diagram labels: Monasteries; Distribution]
Document vs. Information Management Systems
• Caution here – goals of document and records management
systems are to store and preserve information from the perspective
of those who created the information
• End user access is not a primary goal of these kinds of systems
• Taxonomies that you put in place for these kinds of systems don’t
necessarily serve end users’ needs
• Kinds of taxonomies – organization filing structures, record series for
retention & dispositioning, economic sector and impact categories;
some minimal metadata is beginning to emerge, though
• These taxonomies serve filing and storage goals, not the information
access goal of our enterprise taxonomy
Renaissance – Creativity Explodes
• While we were making good progress in synchronizing different
kinds of taxonomies in all of these business areas, a creative
renaissance of knowledge creation and sharing began
• In about 1997, we launched a knowledge management initiative,
using Lotus Notes databases to support collaboration and document
libraries
• Knowledge management was a cultural change in itself – creativity
of organizational units was encouraged and heightened
• It was a very important source of cultural change within the
institution – beginning of a transformation to a learning organization
• It meant that the masses could become interested in taxonomies
Renaissance – Creativity Explodes
• Proliferation of writing, publishing and organizing of information
• Déjà vu all over again – creativity took the form of user-defined
metadata, publishing and navigation taxonomies
• These taxonomies were different from any of the taxonomies we had
seen before – reflected the new thematic structure of the KM
organization
• In some respects there was more confusion because they were
talking about different kinds of taxonomies but trying to fit them into
the same structures
• We began some internal QuickStart educational sessions on
metadata, taxonomies, search, semantic web, etc. to provide a
framework
Popular Information Revolution
• So now we have several business process systems, a decentralized
document management system, knowledge management system – and
there is a popular uprising – the web
• Many web towns are created – 100s of web sites, 1000s of web pages
• No central coordination of virtual villages
• Too many different places to go to look for information – going back to the
medieval monastery network systems
• Masses begin to surface their discontent with the quality of access and
the quality of information that is being published
• Realization among the masses that not all of the quality information
assets are electronic or publicly available
Popular Information Revolution
• Begins to look like the Dark Ages again - no profiles, no taxonomies, no
controlled vocabularies or values
• Different systems have different profiles, different taxonomies, controlled
vocabularies or values, indexes, search systems
• We start to see information pollution – alchemists and court jesters come
back onto the scene – advocating a magical approach to discovering the
enterprise architecture
• But, we didn’t give up – we kept working on the components of the
infrastructure in the background
• We knew that the day would come when they would be needed – and that
day came
Rationalism & Enlightenment
• Wise counselor returns to bring back sense of rationalism and
enlightenment
• Counselor commissions a synthesis of content types across systems,
standard metadata scheme, and the rejuvenation of the World Bank
Thesaurus
• Content of the information is what we focus on for integration
• Information architecture then derives from our kinds of content
• Synthesis and integration work outside of existing systems, but
leverages all the work that is done within the business systems
• Metadata is the central structure (faceted taxonomy)
• Reference sources for each facet support the governance and quality
control (flat, hierarchical and network taxonomy structures)
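A minimal sketch of the three reference structures named above (flat, hierarchical, network) and how they might back quality control of a metadata record. All values, facet names, and function names here are hypothetical, and plain dictionaries and sets stand in for whatever store actually holds the reference sources.

```python
# Flat: a simple set of allowed values (hypothetical disclosure values).
FLAT = {"Public", "Restricted"}

# Hierarchical: child -> parent (hypothetical business activities).
HIERARCHY = {"Project Appraisal": "Lending", "Lending": "Operations"}

# Network: term -> related terms, as in a thesaurus (hypothetical topics).
NETWORK = {"education": {"schooling", "literacy"}, "schooling": {"education"}}


def ancestors(term):
    """Walk up the hierarchy and return the broader terms, nearest first."""
    path = []
    while term in HIERARCHY:
        term = HIERARCHY[term]
        path.append(term)
    return path


def related(term):
    """Return related terms from the network structure, sorted for stability."""
    return sorted(NETWORK.get(term, set()))


def invalid_facets(record):
    """Return the facet names whose values are not in their reference source."""
    errors = []
    if record.get("disclosure") not in FLAT:
        errors.append("disclosure")
    activity = record.get("business_activity")
    if activity not in HIERARCHY and activity not in HIERARCHY.values():
        errors.append("business_activity")
    return errors
```

Keeping one reference source per facet means a record can be checked facet by facet, which is the governance role the slide assigns to these sources.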
Scientific Revolution & Industrialization
• About this time, the visionary counselor begins to lay the groundwork for a
superhighway connecting all information systems – using the
integrated enterprise taxonomy as a blueprint
• Content type proposal – enterprise-wide review of kinds of information
is completed and accepted by Information Architecture Committee
• Establishment of Bank standard metadata – deriving from existing
metadata across systems
• Long-term search strategy proposed and submitted to Information
Architecture Committee
• Simplified Enterprise Taxonomy for topics is formed – looking across
all systems and looking to the systems that are used by our partners
Space Travel - Portals
• The wild and crazy growth of the external website of the Bank, as well
as the need to create a new internal web services platform raised
awareness of the value of an integrated enterprise taxonomy
• You need some predictability in the source and target systems before
you can syndicate content from an SAP BW cube, a newsfeed source,
a DM system, an RM system, Archives, and the InfoShop to a project
portal or to a personal portal; they all need to have a common point of
reference
• The portal team tried the vendor’s suggested approach – create and
implement simple new hierarchies and use them throughout the portal
• The enterprise taxonomy actually becomes the technical and
information infrastructure of the portal – metadata repository, global
navigation bars, …
• Taxonomies also now must be an integral part of the content that you
are creating in the portals and in the systems that provide content to
the portals
Back to Communications
• Vision of a whole-Bank search – one place to go to find information in
any of the Bank’s systems, speaking any of the languages of our clients
• Vision involved having a search engine that spoke the Bank’s business
language and the languages of our clients – another kind of taxonomy
• We had a print-based ‘topical’ thesaurus which needed to be updated
and expanded to reflect the Bank’s business in 2000 (moved this from
10,xxx terms in 1997 to 92,xxx in 2003)
• At the same time, the Translations Department was implementing a new parallel
translation system which leverages multilingual and cross-language
glossaries
• Translations Department glossaries focus on business functions, WB
Thesaurus focuses on topics – integration and cross-population now in
progress
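To make the idea of a search engine that "speaks" the business language and client languages more concrete, here is a small sketch of query expansion using a thesaurus and a multilingual glossary. The entries, the language codes, and the function `expand_query` are assumptions for illustration only; the actual thesaurus and glossary structures are not described in the slides.

```python
# Hypothetical thesaurus: preferred term -> equivalent terms.
SYNONYMS = {"water": ["water resources"]}

# Hypothetical glossary: preferred term -> {language code: translation}.
TRANSLATIONS = {"water": {"fr": "eau", "es": "agua"}}


def expand_query(term, languages=()):
    """Return the search term plus its synonyms and requested translations."""
    term = term.lower()
    expanded = [term] + SYNONYMS.get(term, [])
    for lang in languages:
        translated = TRANSLATIONS.get(term, {}).get(lang)
        if translated:
            expanded.append(translated)
    return expanded
```

For example, `expand_query("Water", ["fr", "es"])` would search on the original term, its synonym, and both translations.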
Transparency
• Policy on Information Disclosure (2002) approved by the Board of
Executive Directors required that we:
– develop a metadata based, cross-system Catalog to surface
disclosed and disclosable documents for the external public user
– put in place a system that would support the capture and
tracking of disclosure requests in the future and record changes
in disclosure status
• This effort funded the first release of whole-Bank search
• Disclosed and disclosable documents lived in all of those systems
above and were not tagged with their disclosure conditions or status
• In order to deliver WB Catalog, we had to integrate all of those
taxonomies described above as well as the long-term search
strategy
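The two disclosure requirements (surface disclosed documents, and record changes in disclosure status) can be sketched as a catalog entry that keeps its own status history. The class name, field names, and status values such as "Undetermined" and "Disclosed" are hypothetical, not the Bank's actual values.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CatalogEntry:
    title: str
    disclosure_status: str = "Undetermined"  # hypothetical default value
    history: list = field(default_factory=list)  # (date, old status, new status)

    def set_status(self, new_status, on):
        """Change the disclosure status and record the change."""
        self.history.append((on, self.disclosure_status, new_status))
        self.disclosure_status = new_status


def public_view(entries):
    """Titles that an external public user would be allowed to see."""
    return [e.title for e in entries if e.disclosure_status == "Disclosed"]
```

Because every change is appended to `history`, the catalog can answer both "what is public now" and "when did this document's status change".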
Information Universe
• Let’s jump to the 21st century – Enterprise Content Architecture and Enterprise
Content Management
• All those taxonomies we worked on for the past 15 years are now integral
components of the enterprise content architecture
• We’re finding that these taxonomies are critical to efficient and effective use of
portal technologies
• Allows us to shift the focus to information content, metadata management,
taxonomies, search, access, security, disclosure….
• Now the impetus is to bring them all under central control so that they can be
managed and used by systems across the enterprise
• Let’s see what the enterprise taxonomy looks like today, its content, how we
maintain and manage it
Information Universe
• We realize that we really do want to work and travel in a 21st century universe of
information
• Space travel is not magical, but is based on good engineering and maintenance
• Managers need to understand that quick fixes and solutions do not result in
sustainable systems, but rather result in significant investment losses
• A multi-dimensional design approach supports flexibility, extensibility, and
customization
• We can view our information universe from several different perspectives
– Individual systems landscape
– A technical architecture landscape
– User’s view of the enterprise taxonomy
– An information architecture landscape
• All of these views make up our Enterprise Content Architecture and allow us to
move to the next step – Enterprise Content Management
Systems Architecture
[Diagram]
• Site Specific Searching; Publications Catalog; World Bank Catalog/Enterprise Search;
Recommender Engines; Personal Profiles; Portal Content Syndication; Browse &
Navigation Structures
• Metadata Repository of Bank Standard Metadata (Oracle Tables & Indexes)
• Reference Tables – Topics, Countries, Document Types (Oracle data classes)
• Transformation Rules/Maps
• Data Governance Bodies
• Metadata Extracts from: Doc Mgmt System Metadata; PeopleSoft; JOLIS Metadata;
InfoShop Metadata; SAP Financial System Metadata; Web Content Mgmt. Metadata
• Concept Extraction, Categorization & Summarization Technologies
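The flow suggested by the Systems Architecture labels (metadata extracts from source systems, passed through transformation rules/maps into a metadata repository) might look roughly like the sketch below. The source field names, the rule tables, and the in-memory `repository` list are assumptions; the slides indicate only that the real repository uses Oracle tables and indexes.

```python
# Hypothetical transformation rules: source field -> Bank standard field.
RULES = {
    "doc_mgmt": {"doc_title": "title", "doc_author": "agent", "ctry": "country"},
    "web_cms": {"headline": "title", "byline": "agent"},
}

repository = []  # in-memory stand-in for the metadata repository


def transform(source, extract):
    """Rename the fields of one extract to standard names; drop unmapped fields."""
    rules = RULES[source]
    return {rules[k]: v for k, v in extract.items() if k in rules}


def load(source, extract):
    """Transform an extract, tag it with its source system, and store it."""
    row = transform(source, extract)
    row["source_system"] = source
    repository.append(row)
    return row
```

Keeping the rules in data rather than in code is one way to let governance bodies review and change mappings without touching the loader.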
Technical View of the Enterprise Architecture
[Diagram]
• Actors: Content Contributor; End User
• Layers: Content Systems; DELIVERY; Content Access Services; Content Management
Services; Metadata Management and Security Services; Content Integration and
Archives Services; Repositories Services; Business Systems
• Service labels: access rules; ePublish; view; multilingual search; search; syndication;
browsing; notification; retention schedule; PDS; workflow; create/del.; check in/out;
versioning; declare; classification; reference data; taxonomy; thesaurus; relate;
Connector; Concept extraction; rules evaluator; harmonize; Adapter; data dic.;
monitors; Archives; Store; logs; Over Time
• Repositories and systems: SAP (R/3, BW); Documents, Images, Audio, Data records;
Metadata warehouse; PeopleSoft; Notes/Domino; iLAP
User’s View of the Enterprise Taxonomy
Information Architecture
• Title; Author; Keyword; Content Type; Topics; Bus. Activity; Format; Disclosure
Bank Standard Metadata by Purpose
• Identification/Distinction: Agent; Title; Date; Format; Publisher; Language; Version;
Series & Series #
• Search & Browse: Country; Region; Abstract/Summary; Keywords;
Subject-Sector-Theme-Topic; Business Function; Content Type
• Use Management: Authorized By; Rights Management; Access Rights; Location;
Use History; Disclosure Status; Disclosure Review Date
• Compliant Document Management: Record Identifier; Disposal Status; Disposal Review
Date; Management History; Retention Schedule/Mandate; Preservation History;
Aggregation Level; Relation
Taxonomies in Action
• Metadata in Fielded Search – Faceted Taxonomy
• Topics Taxonomy – Shallow Hierarchy
• Business Activity Taxonomy – Deep Hierarchy
• Organizational Taxonomy – Faceted Taxonomy
• Country – Region Taxonomy – Hierarchy
• Thesaurus in Search – Faceted Taxonomy
• Disclosure Status – Flat Taxonomy
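A small sketch of how several of these taxonomies could work together in a fielded, faceted search: records are filtered on exact facet values, while a region filter is resolved through the country-region hierarchy. The sample records, the `REGION` table, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical catalog records tagged with standard metadata facets.
RECORDS = [
    {"title": "Rural schools review", "topic": "Education", "country": "Kenya", "disclosure": "Public"},
    {"title": "Teacher training note", "topic": "Education", "country": "Peru", "disclosure": "Restricted"},
    {"title": "Water tariff study", "topic": "Water", "country": "Ghana", "disclosure": "Public"},
]

# Hypothetical country -> region hierarchy.
REGION = {"Kenya": "Africa", "Ghana": "Africa", "Peru": "Latin America"}


def search(records, **facets):
    """Keep records matching every facet; 'region' is resolved via the hierarchy."""
    def matches(record):
        for name, wanted in facets.items():
            if name == "region":
                value = REGION.get(record["country"])
            else:
                value = record.get(name)
            if value != wanted:
                return False
        return True
    return [r for r in records if matches(r)]


def facet_counts(records, facet):
    """Count records per facet value, e.g. to label browse links."""
    return Counter(r[facet] for r in records)
```

The flat disclosure facet needs only equality, while the hierarchical facet lets a user search at region level even though records are tagged at country level.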
Top Tier Content Type Examples
• Documents in IRIS, ImageBank, IRAMS…
• Data in BW, DEC SIMA queries in central, regional & agency databases,
CDF indicators, GDF data reports…
• Publications in JOLIS, Office of Publisher, Thematic Group databases…
• Communications in External Affairs, Office of President, DEC, IRIS…
• People & Communities in YourNet, PeopleSoft, WBDirectory,…
• Knowledge in Notes databases, Oral History program,…
• Services in WB Yellow Pages, Service Portal,…
• Collections in EIU database, Oxford Analytica
Lessons Learned
• You can change some of the information architecture, but some of it
you will have to adapt or map
• Business functions are the most critical for standardizing to single
business taxonomy – the move towards standardization has to come
from above
• Map business system taxonomies to enterprise taxonomies - help
the business system owners to see the value of being part of an
enterprise taxonomy (no value, no buy in)
• Expect change and be ready to integrate and map, but educate your
users to alert you to changes – make it possible for them to work
with you
• Do outreach and consciousness raising (QuickStart programs on
metadata, taxonomies 101, search engines, semantic engines,…)
Lessons Learned
• Move forward on the end user front while you’re working on the
backend – when people can see the actual value they will buy in
(no one wants to be left out of the WB Catalog now – we
created it, so they are coming)
• Have to have a goal and a vision – you will never succeed at
creating an enterprise taxonomy if you don’t know why you’re doing
it
• We are putting in place an enterprise architecture based on
well-defined and managed taxonomies that are used within and by
internal systems
• This gives us flexibility to build different products and views for end
users, while internally managing our information assets