Program: electronic library and information systems, Vol. 46 Iss: 2, pp.258 - 276
Non-Agricultural Databases and Thesauri: Retrieval of Subject Headings and Non-Controlled
Terms in Relation to Agriculture
Tomaz Bartol
University of Ljubljana, Department of Agronomy, Biotechnical Faculty, Jamnikarjeva 101, 1001
Ljubljana, Slovenia
ABSTRACT 1
Purpose – The paper aims to assess the utility of non-agriculture-specific information systems,
databases, and respective controlled vocabularies (thesauri) in organising and retrieving agricultural
information. The purpose is to identify thesaurus-linked tree structures, controlled subject
headings/terms (heading words, descriptors), and principal database-dependent characteristics, and
to assess how controlled terms improve retrieval results (recall) in relation to free-text/uncontrolled
terms in abstracts and document titles.
Design/methodology/approach – Several different hosts (interfaces, platforms, portals) and
databases were used: CSA Illumina (ERIC, LISA, Sociological Abstracts), Ebscohost (Academic
Search Complete, Medline, Political Science Complete), Ei-Engineering Village (Compendex, Inspec),
OVID (PsycINFO), ProQuest (ABI/Inform Global). The search terms agriculture and agricultural and
the truncated word-stem agricultur* were employed. Permuted (rotated index) search fields were used
to retrieve terms from thesauri. Subject-heading search was assessed in relation to free-text search,
based on abstracts and document titles.
Findings – All thesauri contain agriculture-based headings; however, associative, hierarchical and
synonymous relationships show important inter-database differences. Using subject headings along
with abstracts and titles in search syntax (query) sometimes improves retrieval by up to 60%.
Retrieval depends on search fields and database-specifics, such as auto-stemming (lemmatization),
explode function, word-indexing, or phrase-indexing.
Research limitations/implications – Inter-database and host comparison, on consistent principles,
can be limited because of some particular host- and database-specifics.
Practical implications – End-users may exploit databases more competently and thus achieve
better retrieval results in searching for agriculture-related information.
Originality/value – The function of as many as 10 databases in different disciplines in providing
information relevant to subject matter that is not a topical focus of databases is assessed.
Note: This is an extended and updated version of a selected paper from MTSR'11.
KEYWORDS: thesauri, controlled vocabularies, indexing, subject headings, databases, agriculture
1 Introduction
The most important share of relevant global agricultural information is indexed by three
international bibliographic databases, Agris, Agricola, and CAB Abstracts, compiled and organised
by the FAO (Food and Agriculture Organization) of the United Nations, NAL (National
Agricultural Library of the United States Department of Agriculture), and CABI (CAB
International), respectively. A specialised database, FSTA (Food Science and Technology
Abstracts), is compiled by IFIS (International Food Information Service). CAB Abstracts is
considered the most comprehensive global agricultural database and has a very complex thesaurus
(CAB Thesaurus). Agricola is to a large extent also an American agricultural electronic library,
linked to the library holdings of the NAL. It is indexed by the NAL Agricultural Thesaurus. The Agris
database and information system, which is linked to the Agrovoc Thesaurus and compiled by the
national centres of the FAO member nations, has an important role as a national agricultural
electronic bibliography of many countries. The above agricultural systems and respective thesauri
have frequently been investigated from the perspective of information science and bibliometrics
(scientometrics). Some of these agricultural vocabularies are being remodelled as ontologies
(Soergel et al., 2004; Sicilia, 2008). The multilingual Agrovoc thesaurus has become an important
system for organization of knowledge in open archive networks (Subirats et al., 2008). It was
compared with other agricultural thesauri with regard to indexing of the same materials (Bartol,
2009). Also, it was mapped with several vocabularies (Morshed et al., 2010), and is increasingly
being used in non-English environments (Hazman et al., 2009). Besides these well-known
bibliographic information systems, agricultural information is increasingly organised in other types
of information systems, for example learning repositories for educational purposes (Manouselis et
al., 2010).
Footnote 1: Author's version of the article in: Program: electronic library and information systems, Vol. 46 Iss: 2, pp.258 - 276
But even the most useful databases, accessible through particular information systems, may account
for only a small part of possibly relevant literature (Hood and Wilson, 2001). Frequently,
multidisciplinary databases provide better index coverage than any single-subject database, as was
shown on the example of 12 different databases, including Medline, Sociological Abstracts, and
Social Sciences Citation Index (Walters and Wilder, 2003). Agriculture is a rather broad research
field and can involve economic aspects, rural social topics, engineering, and much more. Such
aspects can thus be found in many different disciplines and respective electronic services and can be
indexed by many different controlled vocabularies. The agricultural CAB Thesaurus was assessed
along with the Inspec and LISA (Library and Information Science Abstracts) thesauri with regard to
comparison between database subject headings (descriptors) and author keywords (Gil-Leiva and
Alonso-Arroyo, 2007). Bartol and Hocevar (2005) compared author-affiliation-retrieval in CAB
Abstracts, Web of Science, Biosis Previews, Chemical Abstracts, Compendex/Inspec
(Computerized Engineering Index/Information Service for Physics, Electronics, and Computing),
Francis, Medline, Pascal, and Sociological Abstracts. The retrieval of agricultural subjects was also
examined by DeLong (2007), using the example of agriculture in a general academic Ebsco database,
Academic Search Elite. Some other subject-heading characteristics need to be taken into account,
for example multi-word phrases which are prone to variation (Savary and Jacquemin, 2003).
Such comparisons of agricultural and other, non-agricultural databases or thesauri are rare.
However, some other unrelated information systems, which are also explored in this research, were
compared with each other. The subject coverage of LISA, Medline, Compendex, and other
databases was investigated, although with regard to issues other than agriculture (Jacso, 1997).
Some biomedicine-specific issues were investigated on an example of Inspec thesaurus (Morris,
2001). The Sociological Indexing Terms were compared with biomedical MeSH (Medical Subject
Headings) and the Thesaurus of Psychological Index Terms by Weaver (2002). The accessibility
through different hosts was addressed on an example of Inspec (Salisbury and Gupta, 2004). Other
aspects involving databases and thesauri were also investigated, such as differences between
text-based retrieval and subject headings. An early study by Parker (1971) compared retrieval based on
controlled and natural language terms. Retrieval by titles, abstracts, and subject headings, in
Compendex, was investigated by Byrne (1975). Controlled (subject headings) and uncontrolled
subject descriptions (word-stems from titles and abstracts) produce similar levels of performance in
retrieval and are thus complementary (Shaw Jr, 1993). Jenuwine and Floyd (2004) also concluded
that MeSH descriptors and text-words should be used together for maximal retrieval. A similar
study, on an example of the Thesaurus of Social Science Terms and Synonyms, also found that
combining both controlled-vocabulary and free-text terms yielded more relevant items than either
method alone (Knapp et al., 1998). This must evidently be an efficient way to improve retrieval
recall. Namely, some studies found no consistency between terms that occur in the abstract and title
and terms that are used as subject headings (DeLong, 2006).
Many differences exist not only among different databases, which are field-specific anyway, but
also in the same database available through different hosts or platforms. Retrieval of records from
the same database can vary depending on the interface, as was shown with the examples of Dialog
DataStar, EBSCOhost, and OVID (Younger and Boddy, 2009). Moreover, free open-access platforms do
not offer some of the important features that are available through commercial versions of platforms
(Brown, 2003). Important retrieval differences between two different platforms, based on the
example of Biological Abstracts, were investigated by Bandyopadhyay (2010). Some other
differences in platforms, acting as a gate to the same database, were assessed by Sewell (2011),
such as thesaurus access, search history, and export features on the example of CAB Abstracts.
Dialog employs an automatic right-hand truncation, as opposed to EBSCOhost, rendering it
difficult to compare retrieval results on the same principles, so Younger and Boddy (2009) advise the
use of truncation and suggest more user education with regard to the database and interface
specifics. Namely, in a user-study on Compendex (Ei Thesaurus, or Engineering Index Thesaurus)
only one student paid attention to truncation (Anghelescu et al., 2005). Users’ comprehension of
thesauri is limited, as was shown on the example of ProQuest Controlled Vocabulary (Greenberg,
2004), so more user education is recommended.
Agriculture-related retrieval has frequently been assessed on examples of the three leading global
agricultural databases. This study, however, investigates bibliographic information systems
(databases) which are probably seldom used for the retrieval of agricultural information, although
we expect them to be a valuable source of such information. The purpose is to identify the
principal characteristics of the respective non-agricultural controlled glossaries or thesauri
(subject headings) and retrieval procedures, based on some shared general terms. The terms in the
midst of the thesaurus hierarchy (henceforth referred to as 'middle terms')
are probably very different among databases, given a very different subject coverage of each
respective database. This difference applies even more to the terms at the end of hierarchical
branches ('leaf-terms'), so the choice of shared terms is limited. However, we do expect very
general terms, such as agriculture and agricultural, to be included in all respective thesauri as
indexing terms that can be used for subsequent retrieval. We therefore aim to thoroughly
investigate and compare thesauri based on these terms. We wish to ascertain how these differences
may affect information retrieval. Subject-heading based retrieval, in such systems, depends not only
on indexing principles, but also on database- and host-specific characteristics which determine the
most appropriate search techniques, so the purpose is to identify such characteristics. The use of
such terms, in search queries, is intended to improve search results, so the study also aims to
establish, in all respective databases, how the retrieval is improved if the subject headings are used
along with some selected text fields, such as abstracts and document titles.
2 Materials and Methods
2.1 Brief general description of information systems under study
The databases and respective hosts (interfaces or platforms) which were employed in this paper are
concisely presented in Table I. Databases are arranged according to the particular host where the
database was accessed for the purposes of this study. Some databases are compiled by professional
associations, agencies, or publishers (for example, ERIC (Educational Resources Information
Center), PsycINFO, Medline); some others are a product of information services which frequently
also provide access to other databases (for example, Ebsco, Proquest). In Table I, we also present
the major subject focuses of databases. More information is widely available on database-related
web pages. The host-database abbreviations (for example, C-Eric for CSA Illumina ERIC; CSA stands
for Cambridge Scientific Abstracts) are later used throughout the text and also in tables and
figures for the purposes of conciseness. We provide the original names of structured vocabularies
or glossaries – thesauri – in each database. We also use the names for subject headings (controlled
terms) as they are labelled in respective hosts (for example, descriptors, subject terms, heading
words, controlled terms). Namely, subject-heading names are database- and host-specific and are
thus represented by different search-field names. In the subsequent text, however, we will
frequently refer to all controlled terms as subject headings. We also provide the approximate
number of (total) records which could be retrieved in respective databases in the period 2000–2010.
It is becoming increasingly difficult to obtain up-to-date information on the total numbers of records
in databases and information systems. Some databases are gradually being updated with older or
archival records, so the numbers provided in Table I are approximate. Official fact-sheets
sometimes omit the information on total numbers of records and prefer to offer only information on
the coverage of journal titles and other publications. Many information systems increasingly
provide access to full-text journals and sometimes also newspapers, general magazines, wire feeds,
and other types of documents. This is especially the case with the very high number of records in
the P-ABInf.
Table I. Database hosts (platforms), databases, thesauri with the number of subject headings, and
total number of database records in the period 2000–2010

Abbreviation | Database and subject focus | Thesaurus (subject-heading label): number of terms | Records

CSA Illumina
C-Eric | ERIC - education, schools and teaching ... | ERIC Thesaurus (descriptors): 6,000 preferred, 4,500 non-preferred terms | 348,000
C-Lisa | LISA - library and information science ... | LISA Thesaurus (descriptors): over 6,000 total terms (exact data are not available) | 127,000
C-SocAb | Sociological Abstracts - social and behavioral science ... | Thesaurus (descriptors/Soc. Indexing Terms): 4,088 preferred (1,456 top terms), 2,739 non-pref. terms | 365,000

Ebsco
Eb-ASC | Academic Search Complete - general scientific database ... | Thesaurus (subject terms): 196,000 preferred and 204,000 non-pref. terms | 13,367,000
Eb-Medl | Medline - biomedicine (incl. veterinary science) ... | MeSH Thesaurus (Medical Subject Headings): 26,142 preferred, 177,000 non-pref./entry terms | 7,108,000
Eb-PSC | Political Science Complete - political sciences, legislation ... | Political Science Thesaurus (subject terms): 7,366 preferred, 10,175 non-pref. terms | 556,000

Ei (Engineering Village)
Ei-Comp | Compendex - chemical, civil, electrical, mechanical engineering ... | Ei Thesaurus (Ei controlled terms): 10,200 preferred, 9,420 non-pref. terms | 6,933,000
Ei-Insp | Inspec - physics, electrical engineering & electronics ... | Inspec thesaurus (Inspec controlled terms): 9,573 preferred, 8,826 non-pref. terms | 5,402,000

OVID
O-PsyInf | PsycINFO - psychology and related disciplines | Thesaurus of Psychological Index Terms (heading words): 5,613 preferred, 2,609 non-pref. terms | 1,262,000

ProQuest
P-ABInf | ABI/Inform Global - business, economics, management ... | ProQuest Thesaurus (subject headings): 11,000 preferred, 5,600 non-pref. terms | 28,511,000
2.2 Characteristics of selected thesauri
Thesauri which are available through different hosts (platforms) show some distinctive features
which are common in each particular platform. Searches based on subject headings may typically
be conducted in three different ways: with the use of a drop-down menu, where a search field can
be selected from the available list of fields, with the use of a search command, where a particular
field code must be used, and also directly from a thesaurus, which must usually be accessed
separately, for example from a Search-Tool tab. These three methods show some important
differences. The drop-down lists usually offer only a limited number of fields. This is especially a
problem with subject headings. Many databases employ several different subject-heading fields but
the drop-down list may not include all of them. Moreover, heading fields may be word-indexed or
phrase-indexed but such information is usually not provided on a drop-down list. All these different
fields, however, can be used in a command search mode. But the users need to be familiar not only
with these field possibilities but also with the indexing characteristics of a particular database as
well as with the differences in word/phrase indexing which can play an essential role in retrieval.
Finally, subject headings may be selected from a thesaurus tab. But here also some significant
differences among databases and hosts are evident. Thesaurus-based search in some databases may
offer the utility of a so-called 'explode' function, which automatically retrieves all narrower terms.
But not all thesauri offer this utility. Moreover, some databases and hosts also include the utility of
stemming (lemmatization) when subject headings are used for retrieval.
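The 'explode' function described above can be illustrated with a minimal sketch. The thesaurus below is a toy example: the heading names are borrowed from this paper, but the hierarchy itself (in particular the second level under horticulture) is hypothetical and does not reproduce any of the thesauri under study. The function names `explode` and `first_level` are likewise only illustrative.

```python
from collections import deque

# Toy thesaurus: heading -> list of narrower terms (NT).
# Hypothetical structure, used only to illustrate the principle.
NARROWER = {
    "agriculture": ["animal husbandry", "horticulture"],
    "horticulture": ["gardening"],
    "animal husbandry": [],
    "gardening": [],
}


def explode(heading, narrower=NARROWER):
    """Return the heading plus its narrower terms at every level (breadth-first)."""
    collected = [heading]
    queue = deque([heading])
    while queue:
        term = queue.popleft()
        for nt in narrower.get(term, []):
            if nt not in collected:  # guard against terms reachable by two paths
                collected.append(nt)
                queue.append(nt)
    return collected


def first_level(heading, narrower=NARROWER):
    """Return the heading plus its first-level narrower terms only."""
    return [heading] + list(narrower.get(heading, []))


if __name__ == "__main__":
    print(explode("agriculture"))
    print(first_level("agriculture"))
```

Under these assumptions, `explode` corresponds to hosts that include all narrower terms automatically, while `first_level` corresponds to the situation where only the first NT level can be appended by manual selection.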
In the following account of thesauri, we provide some explanation of the different utilities available
in the thesaurus tab in each respective database/host. Permuted word-indexing appears under
different names, such as 'Rotated Index', 'Term Contains', or 'Permuted Index'. In some thesauri,
such a permuted index will show preferred-terms only, but in other thesauri it will, for example,
also display non-preferred terms. The acronyms UF (Used For), BT (Broader Term), NT (Narrower
Term), and RT (Related Term) will be used throughout the text. Tree structures based on the subject
heading agriculture are presented in Subsection 3.1.1. Some possible shared middle- or leaf-terms,
stemming from agriculture, are also identified. The number of subject headings containing the
permuted terms (word index) agriculture and agricultural, in respective databases, is presented in
Subsection 3.1.2. For reasons of clarity and consistency, the headings are not capitalised
in the text.
CSA Illumina thesauri
1. Alphabetical List – list of terms with no indication of relationships; all preferred and
non-preferred terms (which begin with particular letters) will be listed.
2. Hierarchy – list of preferred terms with relationships. Complete terms (subject headings) need to
be used. The term agriculture will retrieve only the heading agriculture. Alternative agriculture
will not be retrieved. Agricultural will not retrieve anything. A complete heading must be entered,
such as agricultural economics.
3. Rotated Index – list of all preferred and non-preferred terms that contain a particular word, but
not if it occurs in BTs or NTs of the term. RTs are also shown below a heading term. Both
alternative and agriculture will retrieve alternative agriculture. The symbol [+] after a term
indicates that the term has further narrower terms.
Explode – this function includes all narrower terms in database searches.
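The difference between the Hierarchy lookup (complete headings only) and the Rotated Index (any word of a heading) can be sketched as follows. This is a simplified illustration, not the CSA Illumina implementation: the list of headings is a small sample taken from this paper, the function names are hypothetical, and non-preferred terms and related terms, which the real Rotated Index also displays, are left out.

```python
import re
from collections import defaultdict

# Small sample of headings mentioned in the text (not a complete thesaurus).
HEADINGS = [
    "agriculture",
    "alternative agriculture",
    "agricultural economics",
    "tropical agriculture",
]


def hierarchy_lookup(entry, headings=HEADINGS):
    """Phrase lookup: only a complete heading matches."""
    return [h for h in headings if h == entry]


def build_rotated_index(headings=HEADINGS):
    """Word index: every word of a heading points back to the whole heading."""
    index = defaultdict(list)
    for heading in headings:
        for word in set(re.findall(r"[a-z]+", heading.lower())):
            index[word].append(heading)
    return index


def rotated_lookup(word, index):
    """Return all headings that contain the given word."""
    return sorted(index.get(word, []))


if __name__ == "__main__":
    idx = build_rotated_index()
    print(hierarchy_lookup("agricultural"))    # no complete heading matches
    print(rotated_lookup("agriculture", idx))  # all headings containing the word
```

In this sketch, agricultural returns nothing in the phrase lookup but returns agricultural economics in the word index, which mirrors the behaviour described in points 2 and 3.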
Ebsco thesauri
Ebsco uses a complex system of descriptors which can appear as major headings accompanied
by specific qualifiers or subheadings. Dozens of agriculture-related individual descriptors exist,
such as agriculture & energy (BT: power resources) and agriculture–social aspects (BT: social
change), which are not linked to agriculture with either a BT or an RT.
1. Term Begins With – alphabetic list of preferred and non-preferred terms.
2. Term Contains – all preferred and non-preferred terms will be listed, including those terms where
agriculture appears in an annotation/scope note, BTs, and UFs (but not if it appears in RTs or
NTs of the heading term).
3. Relevancy Ranked – similar function to 'Term Contains', but will also retrieve some other terms
which are related to the term agriculture (e.g. agricultural).
The Explode function includes all narrower terms in database searches. Ebsco MeSH has some
additional utilities: the symbol [+] before a term implies further NTs. It is also possible to limit
searches to a heading as a major topic.
Ei – Engineering Village thesauri
1. Search – a list of preferred terms and also those where agriculture appears in BTs, NTs, RTs, and
UFs (UFs are italicised).
2. Exact Term – only the exact phrase, for example agricultural products but not agricultural.
3. Browse – alphabetical list of all preferred and non-preferred terms.
In order to explode-search, the narrower terms (first NT level) need to be appended by manual
selection of terms in the thesaurus.
OVID thesauri
1. Thesaurus – alphabetic list of all preferred and non-preferred terms. Workers will not return
agricultural workers but will return 'personnel Used For workers'.
2. Permuted Index – list (word-indexed) of all preferred and non-preferred terms.
3. Scope Note – phrase indexed – will return only those exact headings or non-preferred terms
which have a scope note. Heading workers will retrieve the preferred term personnel but not
agricultural workers.
Explode – this function includes all narrower terms in database searches.
ABI/Inform Global (ProQuest) thesauri
ProQuest products use ProQuest Thesaurus; other database-specific thesauri are available through
the ProQuest platform, such as MeSH.
1. Contains Word(s) – permuted list of preferred and non-preferred terms, where agriculture is a
part of a descriptor.
2. Begins With – alphabetic list of preferred and non-preferred terms.
In order to explode-search, the narrower terms (first NT level) need to be appended by manual
selection of terms in the thesaurus.
2.3 Search syntax
In the previous subsection we presented some specifics of identification of thesaurus terms in a
separate Thesaurus tab in respective databases. This utility can also be used for retrieval of database
records. However, retrieval based on a thesaurus tab will usually retrieve an exact thesaurus term
only. For example, agriculture will only retrieve those records which contain this exact heading.
Sustainable agriculture will not be retrieved. In our subsequent analysis, we wished to identify all
possible subject headings containing the terms agriculture and agricultural. Next, we
assessed retrieval based on either term, as well as retrieval based on the truncated agricultur*.
Retrieval of some selected middle- or leaf-terms was also assessed. These results are presented in
Subsection 3.1.2.
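The effect of exact-heading retrieval, word-indexed retrieval, and right-hand truncation can be sketched on a handful of records. The record numbers and their subject headings below are invented for illustration, and the two functions are hypothetical simplifications of phrase-indexed and word-indexed subject-heading fields rather than a model of any particular host.

```python
import re

# Invented records: record id -> assigned subject headings.
RECORDS = {
    1: ["agriculture"],
    2: ["sustainable agriculture"],
    3: ["agricultural economics"],
    4: ["rural sociology"],
}


def phrase_search(heading, records=RECORDS):
    """Phrase-indexed field: a record matches only on the exact heading."""
    return sorted(rid for rid, headings in records.items() if heading in headings)


def word_search(term, records=RECORDS):
    """Word-indexed field: match any single word; a trailing * truncates."""
    truncated = term.endswith("*")
    stem = term[:-1] if truncated else term

    def matches(word):
        return word.startswith(stem) if truncated else word == stem

    hits = []
    for rid, headings in records.items():
        words = [w for h in headings for w in re.findall(r"[a-z]+", h.lower())]
        if any(matches(w) for w in words):
            hits.append(rid)
    return sorted(hits)


if __name__ == "__main__":
    print(phrase_search("agriculture"))  # exact heading only
    print(word_search("agriculture"))    # any heading containing the word
    print(word_search("agricultur*"))    # truncated word-stem
```

With these invented records, only the truncated word-stem retrieves records indexed with either agriculture or agricultural, which is the reason for using agricultur* in the comparison.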
To offer an additional perspective of the use of subject headings in respective databases, we
compared retrieval with the word-stem agricultur* in Subject-Heading database fields and retrieval
with this word-stem in the database fields Document-Title and Abstract. In Table II we present the
search syntax which served as a basis for the results presented in Subsection 3.2. We employed
those field codes for subject headings which retrieve records on the principle of permuted
word-indexing, offering a possibility for a more consistent comparison of databases. Title and Abstract
fields are used consistently among databases. Field-name prefixes, as presented in Table II, are
usually not case-sensitive, with the exception of databases hosted by Ebsco. Search (Boolean)
operators are not case sensitive.
Table II. Queries in respective databases, based on Boolean OR, in the database fields Subject
Headings (permuted index: DE/SU/MW/CV/HW), Abstract (AB), and Title (TI)

Database | Search syntax: subject headings | Search syntax: abstracts or titles
C-Eric | DE=agricultur* | AB=agricultur* or TI=agricultur*
C-Lisa | DE=agricultur* | AB=agricultur* or TI=agricultur*
C-SocAb | DE=agricultur* | AB=agricultur* or TI=agricultur*
Eb-ASC | SU agricultur* | AB agricultur* or TI agricultur*
Eb-Medl | MW agricultur* | AB agricultur* or TI agricultur*
Eb-PSC | SU agricultur* | AB agricultur* or TI agricultur*
Ei-Comp | agricultur* WN CV | agricultur* WN AB or agricultur* WN TI
Ei-Insp | agricultur* WN CV | agricultur* WN AB or agricultur* WN TI
O-PsyInf | agricultur*.HW | agricultur*.AB or agricultur*.TI
P-ABInf | SU(agricultur*) | AB(agricultur*) or TI(agricultur*)
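The host-specific syntax in Table II follows a small number of patterns, which can be captured in a short sketch that assembles the query strings. The field templates are taken from Table II; the way the subject-heading clause is joined to the abstract and title clauses with a single OR is an assumption made for illustration, as is the function name `build_query`.

```python
# Field templates per database, taken from Table II: (subject heading, abstract, title).
FIELD_SYNTAX = {
    "C-Eric":   ("DE={t}", "AB={t}", "TI={t}"),
    "C-Lisa":   ("DE={t}", "AB={t}", "TI={t}"),
    "C-SocAb":  ("DE={t}", "AB={t}", "TI={t}"),
    "Eb-ASC":   ("SU {t}", "AB {t}", "TI {t}"),
    "Eb-Medl":  ("MW {t}", "AB {t}", "TI {t}"),
    "Eb-PSC":   ("SU {t}", "AB {t}", "TI {t}"),
    "Ei-Comp":  ("{t} WN CV", "{t} WN AB", "{t} WN TI"),
    "Ei-Insp":  ("{t} WN CV", "{t} WN AB", "{t} WN TI"),
    "O-PsyInf": ("{t}.HW", "{t}.AB", "{t}.TI"),
    "P-ABInf":  ("SU({t})", "AB({t})", "TI({t})"),
}


def build_query(database, term="agricultur*", with_subject=True):
    """Assemble an OR query for one database (the joining rule is an assumption)."""
    subject, abstract, title = FIELD_SYNTAX[database]
    parts = [abstract, title]
    if with_subject:
        parts.insert(0, subject)
    return " or ".join(part.format(t=term) for part in parts)


if __name__ == "__main__":
    for db in FIELD_SYNTAX:
        print(f"{db:<9} {build_query(db)}")
```

Running each database once with `with_subject=False` and once with `with_subject=True` gives the two result sets whose sizes can then be compared.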
3 Results and Discussion
3.1 Characteristics of thesauri and use of subject headings
3.1.1 Tree structures based on the subject heading 'agriculture'
The first part of our study involved assessment of thesauri and the respective tree structures of
subject headings and analysis of hierarchical (Broader Term, Narrower Term), associative (Related
Term), and preferential relations (Used For) of a general subject heading agriculture. In Table III,
and subsequently Figure 1, we present simplified tree structures and numbers of terms which are
linked to agriculture in respective database thesauri. There is no such heading in O-PsyInf. Also, in
Eb-PSC there is no independent heading consisting of agriculture alone. However, in Eb-PSC there
are eight multi-word headings containing agriculture, many supplied with additional qualifiers.
Table III. Tree structures for the heading agriculture and number of terms in respective thesauri

C-Eric
UF (11): agcl safety, agcl science, agcl sciences, agcl supplies, agcl trends, agriscience, farm supplies, feed industry, feed stores, fertilizers, livestock feed stores
BT1 (1): technology
NT1 (4): agronomy, animal husbandry, gardening, horticulture
RT (19): agribusiness, agcl colleges, agcl education, agcl engineering … natural resources, ornithology, rural sociology, seasonal employment, veterinary medicine

C-Lisa
UF (1): farming
BT1 (1): food industry
NT1 (6): agcl economics, agcl engineering, alternative agriculture, animal husbandry, horticulture [1], tropical agriculture
RT (1): farmers

C-SocAb
UF (2): agronomy/agronomists, farming
BT (0)
NT1 (2): animal husbandry, part time farming
RT (24): agrarian societies, agrarian structures, agribusiness, agcl development, agcl economics … industry, land use, plants (botanical), rural areas, soil conservation

Eb-ASC
UF (2): farming, husbandry
BT1 (2): industrial arts, life sciences
NT1 (106): acclimatization (plants), aerial photography in agriculture, aeronautics in agriculture, agcl ability, agcl chemistry … volunteer workers in agriculture, water in agriculture, wetland agriculture, women in agriculture, zinc in agriculture
RT (24): agrarian societies, agcl colleges, agcl education, agcl exhibitions, agcl extension work … land use, rural, physiocrats, rural industries, 'sociology, rural'

Eb-Medl
UF (2): agcl development, agcl workers
BT1 (1): 'technology, industry, and agriculture'
NT1 (9): agcl irrigation, animal husbandry, aquaculture [1], beekeeping, dairying, gardening, hydroponics, organic agriculture, weed control
RT (0)

Eb-PSC
no single heading agriculture; 8 multi-word headings containing agriculture

Ei-Comp
UF (2): agcl applications, limestone--agcl applications
BT1 (1): industry
NT1 (8): agcl products [11], agronomy, crops, cultivation, farms [3], forestry [3], harvesting, irrigation [2]
RT (14): agcl chemicals, agcl engineering, agcl machinery, agcl runoff, animals … nitrogen fertilizers, orchards, rural areas, soil conservation, veterinary medicine

Ei-Insp
UF (0)
BT1 (1): farming
NT1 (1): irrigation
RT (19): agcl engineering, agcl machinery, agcl pollution, agcl products, agcl safety, agrochemicals … natural resources, organic farming, pest control, soil, vegetation mapping

O-PsyInf
no headings containing agriculture

P-ABInf
UF (0)
BT (0)
NT (17): agribusiness, agcl banking, agcl biotechnology, agcl checkoffs, agcl economics … planting, selective breeding, sustainable agr, tillage, urban farming
RT (21): agcl commodities, agcl education, agcl engineering, agcl lending, agcl management … irrigation, organic farming, pastures, plantations, soil fertility
The heading agriculture is organised quite differently in different thesauri, and has considerably
different BTs, NTs, RTs, and UFs. C-Eric has as many as 11 UFs, but some thesauri have no UFs
(Ei-Insp, P-ABInf). BTs also differ among thesauri, ranging from technology to food industry and
industrial arts. The greatest difference is exhibited in NTs. Ei-Insp has only one NT, but Eb-ASC
has as many as 106 NTs, many of those quite specialised, for example volunteer workers in
agriculture. In Figure 1 we limited the X-axis to 25 because no other heading-group exceeded 24
terms. Eb-Medl has no RTs, but Eb-ASC and C-SocAb both have 24 RTs. Interestingly, among the
24 RTs, these two thesauri share only two RTs: agrarian societies and agricultural technology.
Table III shows only the first and the last five terms if more than 10 different headings exist in a
particular group. The terms agriculture (agr) and agricultural (agcl) are abbreviated.
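The comparison above can be retraced with a short sketch that tabulates the counts from Table III and draws clipped text bars in the manner of Figure 1. The counts are copied from Table III; the helper names `bar` and `top` and the text-bar rendering are illustrative assumptions, not the procedure used to produce the figure.

```python
# Counts of UF, BT, NT and RT for the heading agriculture, copied from Table III.
# Eb-PSC and O-PsyInf have no standalone heading agriculture, so their counts are 0.
COUNTS = {
    "C-Eric":   {"UF": 11, "BT": 1, "NT": 4,   "RT": 19},
    "C-Lisa":   {"UF": 1,  "BT": 1, "NT": 6,   "RT": 1},
    "C-SocAb":  {"UF": 2,  "BT": 0, "NT": 2,   "RT": 24},
    "Eb-ASC":   {"UF": 2,  "BT": 2, "NT": 106, "RT": 24},
    "Eb-Medl":  {"UF": 2,  "BT": 1, "NT": 9,   "RT": 0},
    "Eb-PSC":   {"UF": 0,  "BT": 0, "NT": 0,   "RT": 0},
    "Ei-Comp":  {"UF": 2,  "BT": 1, "NT": 8,   "RT": 14},
    "Ei-Insp":  {"UF": 0,  "BT": 1, "NT": 1,   "RT": 19},
    "O-PsyInf": {"UF": 0,  "BT": 0, "NT": 0,   "RT": 0},
    "P-ABInf":  {"UF": 0,  "BT": 0, "NT": 17,  "RT": 21},
}
AXIS_LIMIT = 25  # the X-axis limit mentioned for Figure 1


def bar(count, limit=AXIS_LIMIT):
    """Text bar clipped at the axis limit; '+' marks a clipped value."""
    return "#" * min(count, limit) + ("+" if count > limit else "")


def top(relation):
    """Databases with the highest count for a relation (ties are all returned)."""
    best = max(row[relation] for row in COUNTS.values())
    return sorted(db for db, row in COUNTS.items() if row[relation] == best)


if __name__ == "__main__":
    for db, row in COUNTS.items():
        print(f"{db:<9} NT {row['NT']:>3} {bar(row['NT'])}")
    print("Most RTs:", top("RT"))
```

The clipped bar for the 106 NTs in Eb-ASC shows why an axis limit of 25 is sufficient for every other group of terms.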
[Figure 1: horizontal bar chart with one group of bars (UF, BT1, NT1, RT) per database; X-axis: number of terms, limited to 25]
Figure 1. Number of non-descriptors (UF), broader terms (BT), narrower terms (NT), and related
terms (RT) in the respective database thesauri for the subject heading agriculture
To offer some additional perspective, we also provide information on possible middle- or leaf-terms
stemming from agriculture. The terms are quite different and strongly database- as well as
host-specific. In most cases, the term agriculture yields only one-level narrower terms, such as
animal husbandry (or animal breeding) and horticulture/gardening which are thus, technically,
leaf-terms. The term agriculture is linked to several different broader terms, for example
technology, industry or life sciences, depending on the particular thesaurus. In Tables IV and V we
present the most frequently occurring narrower terms, which are to some extent shared among
thesauri, but even these terms cannot be found in all thesauri. In three thesauri (Table IV), no exact
relevant terms are available for the concepts of animal husbandry or animal breeding. We have also
checked for some other possible terms, such as animal- or livestock husbandry or production. Those
thesauri which do contain applicable terms have no, or very few, further narrower terms. The
related terms cattle or livestock (or farm animals) are contained in only five of the ten thesauri.
We also investigated the term animals. This zoological term, however, is too general, and cannot
sufficiently represent the concept of farm animals, in the sense of agriculture. Even so, in C-Eric
the term animals does stand as a heading for livestock, along with as many as 61 other non-preferred
terms (UFs). Two other database thesauri, Eb-ASC and P-ABInf, organise such terms as NTs. The
preferred term cattle occurs only in a couple of thesauri. The only database which offers a more
complex animal-related hierarchy is Eb-Medl, on account of its coverage of veterinary medicine. In
Tables IV and V we present only one of all possible NTs or UFs (the first on the hierarchical list in
each thesaurus).
Table IV. Tree structures for the headings related to animal husbandry/animal breeding and
livestock (animals) and number of non-preferred and narrower terms in respective thesauri

Database | animal husbandry/animal breeding | 1. livestock/cattle | 2. animals
C-Eric | animal husbandry UF (4) animal science / NT (0) | 0 | animals UF (61) animal caretakers … livestock / NT (0)
C-Lisa | animal husbandry UF (0) / NT (0) | 0 | animals UF (1) fauna / NT (0)
C-SocAb | animal husbandry UF (2) animal breeding / NT (0) | livestock UF (3) cattle / NT (0) | animals UF (2) fauna / NT (3) livestock
Eb-ASC | animal breeding UF (1) domestic animals-breeding / NT (4) animal mutation breeding | livestock UF (3) live stock / NT (1) cattle | animals UF (7) animal kingdom / NT (76) animal diversity
Eb-Medl | animal husbandry UF (0) / NT (0) | 1.1 livestock; 1.2 cattle UF (6) bos grunniens | animals UF (2) animalia / NT (3) *
Eb-PSC | 0 | 0 | 0
Ei-Comp | 0 | 0 | animals UF (0) / NT (4) birds
Ei-Insp | 0 | 0 USE farming | 0
O-PsyInf | animal breeding UF (1) breeding (animal) / NT (1) selective breeding | cattle UF (2) bulls | animals UF (0) / NT (5) female animals
P-ABInf | breeding of animals UF (1) animal breeding / NT (0) | livestock UF (0) / NT (7) alpaca … cattle | animals UF (0) / NT (76) alpaca … cattle
The terms gardening/horticulture and crops/plants, like animal husbandry and livestock, are
not included very systematically in these thesauri (Table V). Where they do exist, they are
already final or leaf-terms. The thesauri of C-Eric and Eb-ASC are worth noting with regard
to the terms gardening and horticulture. C-Eric contains 8 and 12 respective UFs, and no NTs.
Eb-ASC, conversely, contains as many as 55 and 33 respective NTs but only a single non-preferred
term for each. Crops are included in only five thesauri and, except in two thesauri, Eb-ASC and
P-ABInf, have no further NTs. There are some NTs in the case of plants, for example, in the
P-ABInf thesaurus. But plants, like animals, cannot serve as a precise descriptive term for
agriculture-related concepts. The huge database P-ABInf is a special case, as has been explained. In
this database there are two headings, flowers & plants and crops. The practical distinction between
them, however, is not clear, despite the scope description in the thesaurus. Canola and cassava, for
example, are NTs under both headings. But most agricultural crops, such as cocoa, coffee, and cotton,
are subordinate only to flowers & plants and not to crops. The heading crops has only five NTs. As
was the case with animals, the heading plants is present more consistently, but this botanical
heading is, again, too general to represent agricultural concepts. However, in
conjunction with the heading agriculture, such general terms can still provide some indexing
information in a non-agricultural database.
Table V. Tree structures for the headings related to gardening/horticulture and crops (plants) and
number of non-preferred and narrower terms in respective thesauri
Database | 1. gardening; 2. horticulture | 1. crops; 2. plants
C-Eric | 1. gardening UF (8) gardeners / NT (0); 2. horticulture UF (12) crop planting / NT (0) | 1. 0 (USE Agronomy); 2. plants (botany) UF (7) plantae / NT (0)
C-Lisa | 1. gardening UF (0) / NT (0); 2. horticulture UF (0) / NT (1) plants | 1. 0; 2. plants UF (1) flora / NT (0)
C-SocAb | 1. gardening UF (0) / NT (0); 2. 0 | 1. 0; 2. plants (botanical) UF (3) flora / NT (0)
Eb-ASC | 1. gardening UF (1) bedding (horticulture) / NT (55 !) acclimatization (plants); 2. horticulture UF (1) horticultural science / NT (33 !) acclimatization (plants) | 1. crops UF (4) agricultural crops / NT (18) cash crops; 2. plants UF (2) vegetable kingdom / NT (121 !) acid-tolerant plants
Eb-Medl | 1. gardening UF (0) / NT (0); 2. 0 | 1. agricultural crops UF (0) flora / NT (1) animal feed; 2. plants UF (0) flora / NT (11)
Eb-PSC | 0 | energy-crops only!
Ei-Comp | 0 | 1. crops UF (1) farms-crops / NT (0); 2. plants (botany) UF (1) farms-crops / NT (6) algae ...
Ei-Insp | 1. gardening UF (0) / NT (0); 2. horticulture UF (0) / NT (0) | 1. crops UF (1) genetically modified crops / NT (0); 2. 0
O-PsyInf | 0 | 1. 0; 2. plants (botanical) UF (0) flora / NT (0)
P-ABInf | 1. gardens & gardening UF (1) gardening / NT (4) botanical gardens; 2. horticulture UF (0) / NT (0) | 1. flowers & plants UF (1) Plants / NT (35) algae ...; 2. crops UF (2) agricultural crops / NT (5) barley ...
3.1.2
Retrieval of 'agriculture' and 'agricultural', based on permuted subject headings
This part of the study identified all individual subject headings (preferred terms)
containing the words agriculture and agricultural in the respective thesauri, with the aim of
subsequently retrieving database records with these words. Thesauri also vary quite significantly
with regard to the inclusion of these two terms. In C-Eric, C-SocAb, Ei-Comp, and Ei-Insp there is
only one term containing agriculture, whereas in Eb-ASC there are as many as 121 different
terms (Figure 2). As the other databases contain far fewer terms, the X-axis in Figure 2 is again
limited. No O-PsyInf headings contain agriculture, and only two contain agricultural. But as many
as 213 terms contain the word agricultural in Eb-ASC. Some headings are deeply specialised,
for example women agricultural laborers. This is reflected in the very high number (196,000) of
total preferred terms in the Eb-ASC thesaurus, as was presented in Table I.
Next, we used both subject-heading words (agriculture and agricultural) to retrieve the
records in the respective databases for the period 2000–2010, using the field commands based on the
permuted index (Figure 3). We employed a command-line search syntax and word-indexed
subject-heading search fields (DE/SU/MW/CV), presented in Table II. Additionally, we compared the
retrieval results for the terms agriculture and agricultural with retrieval using the truncated stem
agricultur*.
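The difference between word-indexed (permuted) and phrase-indexed subject-heading fields, and the effect of right-hand truncation, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the small list of headings reuses examples named in this paper, but the selection, the function names, and the matching logic are hypothetical and do not reproduce the behaviour of any particular host.

```python
import re

# Illustrative sample of preferred terms; the headings are examples named in
# the paper, the selection itself is hypothetical.
HEADINGS = [
    "agriculture",
    "sustainable agriculture",
    "alternative agriculture",
    "agriculture libraries",
    "women agricultural laborers",
    "animal husbandry",
]


def build_word_index(headings):
    """Map every single word to the headings that contain it (permuted index)."""
    index = {}
    for heading in headings:
        for word in re.findall(r"[a-z]+", heading.lower()):
            index.setdefault(word, set()).add(heading)
    return index


def search_word_index(index, term):
    """Look up a term; a trailing '*' means right-hand truncation."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        hits = set()
        for word, headings in index.items():
            if word.startswith(stem):
                hits |= headings
        return hits
    return set(index.get(term, set()))


def search_phrase_index(headings, term):
    """Phrase-indexing: only a heading identical to the whole query matches."""
    return {h for h in headings if h.lower() == term.lower()}


index = build_word_index(HEADINGS)
print(len(search_word_index(index, "agriculture")))      # 4
print(len(search_word_index(index, "agricultural")))     # 1
print(len(search_word_index(index, "agricultur*")))      # 5
print(len(search_phrase_index(HEADINGS, "agriculture"))) # 1
```

In this toy index the truncated stem retrieves every heading found by either full word, while the phrase lookup returns only the single-word heading itself.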
The results in P-ABInf need explanation. Almost three times as many records are retrieved with
agriculture as with agricultural. In this database, it is not possible to retrieve records
strictly on the basis of subject headings alone: a search for agriculture retrieves not only the
headings based on this word but also all records classified with 'Classification Code 8400:
Agriculture Industry'. This code accounts for more than
100,000 records. The term agriculture thus retrieves as many as 142,000 records, and
agricultural 'only' 49,000. These rather high numbers must be seen in light of the fact that
as many as 28,511,000 records were indexed by this database in the period 2000–2010. Besides
scholarly articles, this database indexes many millions of items from newspapers, trade
journals, wire feeds, and so on.
In Eb-ASC, Eb-Medl, Ei-Insp, and P-ABInf more records are retrieved with headings containing
agriculture, whereas in C-Eric, C-SocAb, Ei-Comp, and O-PsyInf more records are retrieved with
agricultural. But these two concepts are very similar. For example, agriculture could quite easily be
substituted with agricultural in many headings, such as agriculture libraries. We thus believe that
the use of the word-stem agricultur* (right-hand truncation) is the preferred procedure for
retrieving such records. The Ei Engineering Village databases and thesauri (Inspec and Compendex) need
comment. Their search employs default automatic lemmatization or stemming (autostemming)
when specific subject headings are used as a search criterion, so it makes no difference whether
retrieval is performed with agriculture or with agricultural. We thus switched off autostemming in
order to ascertain the exact number of documents indexed with either term.
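Why autostemming makes the two search words equivalent can be illustrated with a deliberately crude suffix-stripping sketch. The rule set below is a hypothetical toy and is not the stemming algorithm used by Engineering Village or any other host; it only shows that once both words are reduced to the same stem, exact counts per word can no longer be distinguished.

```python
def toy_stem(word):
    """Very crude suffix stripping, for illustration only (hypothetical rules)."""
    word = word.lower()
    for suffix in ("al", "e"):
        # Strip the first matching suffix, keeping at least four characters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def matches(query, indexed_word, autostemming=True):
    """Compare a query word with an indexed word, with or without stemming."""
    if autostemming:
        return toy_stem(query) == toy_stem(indexed_word)
    return query.lower() == indexed_word.lower()


print(matches("agriculture", "agricultural", autostemming=True))   # True
print(matches("agriculture", "agricultural", autostemming=False))  # False
```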
[Figure 2: bar chart by database; series: agriculture (in SH), agricultural (in SH); X-axis: number of subject headings, scale limited to 30]
Figure 2. Number of subject headings containing the permuted terms (word index) agriculture and
agricultural in respective databases
[Figure 3: bar chart by database; series: SH=agriculture, SH=agricultural, SH=agricultur*; X-axis: number of records, scale limited to 50,000; bar labels exceeding the scale: 70300, 130300, 113400]
Figure 3. Retrieval with the permuted terms (word index) agriculture, agricultural, and agricultur*
(truncated) in subject heading fields in respective databases in the period 2000–2010
Again, for additional perspective, we also assessed retrieval based on some selected middle- or
leaf-terms which were identified in Tables IV and V. It was not possible to find enough analogous
terms applicable to all thesauri under study. However, in some thesauri the terms
gardening/horticulture and animal husbandry do occur, so we used those terms to illustrate the results
of a possible subject-heading-based search query (Table VI). The inter-database results differ
considerably, so a table is presented instead of a figure.
Table VI. Retrieval with the terms related to animal husbandry and gardening or horticulture in
subject heading fields in respective databases in the period 2000–2010
Datab
ase
C-Eric
CLisa
CSocAb
EbASC
EbMedl
EbPSC
EiComp
EiInsp
OPsyInf
PABInf
Subject
headings
animal
husbandry
animal
husbandry
animal
husbandry
animal breeding
animal
husbandry
0
O
cc.
8
7
2
Subject headings
gardening OR horticulture
Oc
c.
26
7
28
1
59
3
459
6
998
gardening OR horticulture
91
gardening OR horticulture
91
19
28
9
gardening OR horticulture
gardening
0
0
0
0
gardening OR horticulture
animal breeding
breeding
animals
of
1
766
2
928
32
8
0
"gardens
horticulture
&
gardening"
OR
19
518
The first three databases yield very few records, which is consistent with the much smaller size
of those databases. Still, only two records have been indexed with animal husbandry in C-Lisa. In
Medline, the high number of such records is consistent with strong veterinary coverage. In this
database many other applicable headings are also available. Close inspection of animal
breeding-related bibliographic records in O-PsyInf, however, shows that this heading applies to
very general zoological topics. Indeed, the scope note reveals that in this psychological database
this subject heading also refers to the "propagation (or reproduction) of a species in its natural
environment". Most records are thus related to the ecology of wild animals rather than to
agriculture. In P-ABInf, on the other hand, a huge number of records are retrieved with the
headings gardens & gardening and horticulture. Most of those records have been indexed with
gardens & gardening. As explained above, this database is a special case. Indeed, most of these
records have been classified as Wire Feeds and Trade Journals. Among the more
than 19,500 records (Table VI), only 177 documents have been categorised as Scholarly Journals.
3.2
Comparison of retrieval with subject headings, titles, and abstracts
In the final part of the study, we wished to ascertain how much retrieval can be improved if
subject headings are searched in combination with uncontrolled free-text terms. All databases allow
retrieval through different topic fields, such as titles, abstracts, and subject headings. In
most databases it is also possible to search for records on even broader free-text principles, based on
a particular code, for example TX-Text, or by simply inserting a search term in the command line
without a particular field code. Such text-retrieval possibilities, however, not only differ
among databases but are also strongly host-specific. For example, in some databases, such a
free-term search, or text search, will only be conducted in the 'topics' fields, such as abstracts,
titles, and subject headings. In some other databases, however, the terms will also be retrieved from the
author's address fields, journal titles, and so on. And in some systems the terms will also be retrieved
from the full text of a document, even if the full text is not available to a subscriber. Very general
free-text retrieval therefore cannot serve as a consistent basis for comparison of subject coverage in
different databases. We thus employed only abstracts and document titles, which are organised on
rather consistent principles in all databases. Again, we used the word-stem agricultur*. The search
syntax used as a basis for Figures 4 and 5 was presented in Table II.
In assessing the results in Figure 4, it is necessary to take into consideration the total number of
records in each respective database in the period 2000–2010 (Table I). In C-Lisa, only 680 records
were retrieved using the search query (DE=agricultur* or AB=agricultur* or TI=agricultur*).
But in this database, only 127,000 records in total were indexed in this period. This was also the case
with the other two specialised CSA Illumina databases under study (Eric and Sociological
Abstracts). There are also not many agriculture-based documents in O-PsyInf, even though this
database contains more than a million records from the period 2000–2010. But this is a
psychology-focused database, not closely related to agricultural subjects. At the other end of the scale, as
many as 271,000 records were retrieved with the syntax (SU(agricultur*) or AB(agricultur*) or
TI(agricultur*)) in P-ABInf. But then, as many as 28,511,000 records in total were added to this
database during this period, as it indexes millions of short articles from daily
newspapers and trade journals.
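The two query forms quoted above differ only in how the field code is attached to the search term. A minimal sketch of a helper that assembles such queries is given below; the function name and the two style labels are hypothetical, and only the field codes and syntax patterns shown in the text for C-Lisa and P-ABInf are used. Other hosts may require different codes (Table II) or different operators.

```python
def build_query(term, fields, style):
    """Join one search term across several field codes with 'or'.

    style "prefix"  -> DE=term   (pattern quoted for C-Lisa)
    style "wrapped" -> SU(term)  (pattern quoted for P-ABInf)
    Both style names are hypothetical labels used only in this sketch.
    """
    if style == "prefix":
        parts = [f"{field}={term}" for field in fields]
    elif style == "wrapped":
        parts = [f"{field}({term})" for field in fields]
    else:
        raise ValueError(f"unknown style: {style}")
    return "(" + " or ".join(parts) + ")"


print(build_query("agricultur*", ["DE", "AB", "TI"], "prefix"))
print(build_query("agricultur*", ["SU", "AB", "TI"], "wrapped"))
```

Keeping the term and the field list separate from the host-specific syntax makes it easier to issue the same logical query on several platforms, which is the precondition for the comparison attempted here.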
[Figure 4: bar chart by database; series: SH, AB or TI, AB or TI or SH; X-axis: number of records, scale limited to 80,000; bar labels exceeding the scale: 113900, 271000, 166200, 130400]
Figure 4. Retrieval of records with the term agricultur* in word-indexed subject headings (SH), in
abstracts or titles (AB or TI), and in abstracts, titles, or subject headings (AB or TI or SH), in
respective databases in the period 2000–2010
Abstracts and titles contain concepts from natural language, which can be represented by many
synonymous, associated, or narrower terms, so the main purpose of controlled subject headings is to
improve retrieval on some consistent principles. We determined, as a percentage, how much
retrieval was enhanced by the use of subject headings along with abstracts and titles in comparison
with retrieval using abstracts and titles alone. Figure 5 shows how much more material can be
retrieved if a subject heading is included in a query, using word-indexed subject-heading fields.
Five databases provide more than 50% more records if the subject heading agricultur* is also used.
Three of them, Eb-Medl, Ei-Insp, and P-ABInf, provide more than 60% more records. Here we need to
emphasise again that several different search modes can be used in databases. Greater consistency can
only be achieved if similar search queries are employed, such as those based on a command that
retrieves records on the principles of permuted word-indexing. Commands which employ phrase-indexing
will retrieve far fewer results. This is frequently the case with the field options available through
drop-down menus. The use of agriculture as a search term may then miss relevant descriptors, such
as alternative agriculture, sustainable agriculture, and many more. So command language and a
pertinent search syntax (query) should preferably be used. It needs to be reiterated that this is only a
model based on the simple and very general term agricultur*. All databases under study are
subject-specific. Many other agriculture-related subject terms are available in the respective thesauri. Some
relevant subject headings, for example, are constructed from the adjective rural, which is, however,
also a sociological concept. There is also another, relatively analogous term, farming, which
could also have been examined. However, other, non-agricultural concepts can be associated with
this term, for example wind farming or gold farming, which would seriously distort any possible
comparison among databases. Also, in some databases, thesaurus-based retrieval offers the possibility
to 'explode' a search term, so that records indexed only with a more precise agriculture-related
term are also retrieved. In such a case, the term agriculture itself does not need to
be used as an indexing term, but the records will still be retrieved. Retrieval will thus depend on
the complexity and hierarchical system of a thesaurus, confirming the significance of structured
thesauri in the discovery of knowledge. The aim of this study was thus also to show how difficult it
is to compare subject-based retrieval consistently across different information systems, which are
based on different indexing as well as retrieval principles. End-users need to be aware of such
characteristics and specifics if they wish to use the existing information systems to the full.
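The 'explode' function mentioned above can be pictured as a traversal of the narrower-term (NT) hierarchy. The sketch below is a hypothetical miniature: the tiny thesaurus, the four records, and the function names are invented for illustration and do not correspond to any of the thesauri in Tables IV and V.

```python
# Hypothetical miniature thesaurus: preferred term -> narrower terms (NTs).
NARROWER = {
    "agriculture": ["animal husbandry", "horticulture"],
    "animal husbandry": ["livestock"],
    "livestock": ["cattle"],
    "horticulture": [],
    "cattle": [],
}

# Hypothetical records, each indexed with a set of subject headings.
RECORDS = {
    1: {"agriculture"},
    2: {"cattle"},
    3: {"horticulture"},
    4: {"psychology"},
}


def explode(term, narrower):
    """Return the term plus all of its direct and indirect narrower terms."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(narrower.get(current, []))
    return seen


def search(records, terms):
    """Return ids of records indexed with at least one of the given terms."""
    return sorted(rid for rid, headings in records.items() if headings & terms)


print(search(RECORDS, {"agriculture"}))                     # [1]
print(search(RECORDS, explode("agriculture", NARROWER)))    # [1, 2, 3]
```

Records 2 and 3 are retrieved only after explosion, although neither is indexed with agriculture itself. How much is gained in practice depends on how deep and complete the hierarchy of a given thesaurus is.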
[Figure 5: bar chart by database; X-axis: percentage (%), scale 0 to 70. Values: C-Eric 24; C-Lisa 10; C-SocAb 21; Eb-ASC 54; Eb-Medl 62; Eb-PSC 12; Ei-Comp 57; Ei-Insp 64; O-PsyInf 23; P-ABInf 63]
Figure 5. Enhancement of retrieval (as a percentage) with word-indexed subject headings in relation
to retrieval with free-text terms in abstracts or titles, based on the term agricultur*, in respective
databases in the period 2000–2010
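The percentage in Figure 5 can be read as the relative gain of the combined query (AB or TI or SH) over the free-text query (AB or TI). A short worked example follows, assuming this reading of the measure. The 271,000 total for P-ABInf is stated in the text; the 166,200 count for abstracts or titles is taken from the value labels of Figure 4, and its attribution to P-ABInf is an assumption. The function name is hypothetical.

```python
def enhancement_percent(combined, free_text):
    """Relative gain of (AB or TI or SH) over (AB or TI), in percent."""
    if free_text <= 0:
        raise ValueError("free_text count must be positive")
    return (combined - free_text) / free_text * 100


# P-ABInf: 271,000 records for the combined query (from the text) and,
# as an assumption, 166,200 records for abstracts or titles alone.
p_abinf = enhancement_percent(combined=271_000, free_text=166_200)
print(round(p_abinf))  # 63
```

Under these assumptions the result agrees with the 63% shown for P-ABInf in Figure 5.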
4
Conclusions
Agriculture-related research is systematically compiled in three large databases: Agris, Agricola,
and CAB Abstracts. These three databases, however, may fail to collect many interesting
documents that may have been published in other, non-agricultural publications, especially journals.
Such documents may be indexed by other subject-specific databases, which can be overlooked by
end-users seeking agricultural information. For example, important information on food, nutrition,
medicinal plants, and animal health can be found in Medline (or associated PubMed), compiled by
the National Library of Medicine (NLM). Information on agricultural engineering is compiled by
Compendex. Agricultural-economics-related information is scattered across business and
economics databases. Rural studies are indexed by social-sciences-related information systems. The
main purpose of the study was thus to assess the utility of such non-agriculture-specific information
systems and respective controlled vocabularies in the retrieval of agricultural information, based on
some selected similar or shared terms occurring in all thesauri. There are very few further middle-
or leaf-terms stemming from the broader term agriculture, and such terms are usually not present
in all thesauri. Only the general term (heading) agriculture is included in all respective thesauri, so we
used this term as a model for a more thorough parallel comparison on similar principles. The
numbers of subject headings that contain the words agriculture and agricultural in different thesauri
are quite different. Thesauri also differ in the structure (tree-structures) of respective non-preferred,
broader, narrower, or related terms. Some thesauri include many non-preferred terms, some offer
none. The numbers of related terms also differ quite significantly. The greatest differences are
exhibited in the use of narrower terms. Some multiple-word subject headings can be very complex
and precise, and can include several different words. In many such compound subject headings, the
terms agriculture or agricultural can substitute for each other so it is preferable to employ a
truncated term, based on the word-stem agricultur*, in order to enhance the retrieval. But the use of
such fields is database- as well as host-specific. Retrieval can be further complicated by the fact that
there exist search techniques and fields which are based either on the principles of word-indexing or
phrase-indexing. An uninformed use of phrase-indexing may seriously diminish retrieval results.
Also, drop-down menus usually offer fewer retrieval possibilities than command search. There is
seldom the need to retrieve the exact agriculture-only indexed records. It is thus preferable to select
a search command that retrieves all permuted occurrences of this term. Default stemming or
autostemming (lemmatization), which is employed in some databases, returns more results, which
are, however, less precise. Such a function may thus hinder comparison of retrieval results in
different databases. The explode function, which is available in some systems, additionally
complicates uniform comparison among different thesauri and databases. This function, however,
may be very useful for end-users, provided that a particular thesaurus is based on an expert
hierarchical tree-structure. Free-text search is based on many different principles in different
databases and search platforms. An identical command will in some databases retrieve only
concepts from document titles, abstracts, and subject headings. In some other databases,
however, the entire bibliographic record, and sometimes even the full text of a document, will be
searched. A reasonably uniform comparison of databases can only be achieved through titles, abstracts, and
subject headings. But even subject-heading searching is not uniform. A subject-heading command
will usually retrieve only controlled terms (descriptors) from a thesaurus. In some cases, however,
non-controlled identifiers, other non-thesaurus-based headings (for example, geographic
descriptors), or even the concepts in classification codes which are not a part of a particular
thesaurus will also be retrieved. In five of the ten databases under study, subject headings improved
recall by more than 50%, and in three of them by more than 60%. Using the 'explode' function would
enhance retrieval even more, as frequently only more specific indexing terms are used in databases.
This needs to be explored in further study.
All respective thesauri in the analysis include several indexing terms which are agriculture-specific
and which enhance retrieval. All databases contain relevant records, but all retrieval results must be
interpreted with some caution. Some specialised databases are small, so they contain correspondingly
fewer such records. Some other databases, however, are very large indeed and can be expected to
index many unique agriculture-related documents, for example in the fields of economics or
machinery. Many such documents would possibly not be retrieved in agriculture-specific databases.
The identification of unique records, however, remains a subject of our further research. The
above databases can serve as a valuable source of agriculture-related information, but end-users
must be careful in the formulation of queries, so more activities are recommended in terms of
end-user information literacy and competencies. Differences among search systems, databases, and
controlled vocabularies are quite important and therefore demand a serious and systematic
approach, which may present a challenge in the age of simple-search solutions through Internet
search engines.
5
References
Anghelescu, H.G.B., Yuan, X., Zhang, X. (2005), "Domain knowledge, search behaviour, and
search effectiveness of engineering and science students: an exploratory study", Information
Research. Vol. 10 No.2, pp. 1-20.
Bandyopadhyay, A. (2010), "Examining Biological Abstracts on Two Platforms: What Do End
Users Need to Know?", Science & Technology Libraries, Vol. 29 No.1-2, pp.34-52.
Bartol, T. (2009), "Assessment of classification and indexing of an agricultural journal based on
metadata in AGRIS and CAB Abstracts databases", International Journal of Metadata,
Semantics and Ontologies, Vol. 4 No.1, pp.4-12.
Bartol, T., Hocevar, M. (2005), "The capital cities in the ten new European Union countries in
selected bibliographic databases", Scientometrics. Vol. 65 No.2, pp.173-187.
Brown, J. D. (2003), "The ERIC database: A comparison of four versions", Reference services
review, Vol. 31 No.2, pp.154-174.
Byrne, J. R. (1975), "Relative effectiveness of titles, abstracts, and subject headings for machine
retrieval from the COMPENDEX services", Journal of the American Society for Information
Science, Vol. 26 No.4, pp.223-229.
DeLong, L. (2007), "Subscribing to databases: how important is depth and quality of indexing? ",
Acquisitions Librarian. Vol. 19 No.37/38, pp.99-106.
Gil-Leiva, I., Alonso-Arroyo, A. (2007), "Keywords given by authors of scientific articles in
database descriptors", Journal of the American Society for Information Science and Technology.
Vol. 58 No.8, pp.1175-1187.
Greenberg, J. (2004), "User comprehension and searching with information retrieval thesauri",
Cataloging and Classification Quarterly. Vol. 37 No.3/4, pp.103-120.
Hazman, M., El-Beltagy, S.R., Rafea, A. (2009), "Ontology learning from domain specific web
documents", International Journal of Metadata, Semantics and Ontologies, Vol. 4 No.1/2
pp.24 - 33.
Hood, W. W., & Wilson, C. S. (2001), "The scatter of documents over databases in different subject
domains: how many databases are needed?", Journal of the American Society for Information
Science and Technology, Vol. 52 No.14, pp.1242-1254.
Jacso, P. (1997), "Content Evaluation of Databases", Annual Review of Information Science and
Technology, Vol. 32, pp.231-67.
Jenuwine, E. S., Floyd, J. A. (2004), "Comparison of Medical Subject Headings and text-word
searches in MEDLINE to retrieve studies on sleep in healthy individuals", Journal of the
Medical Library Association, Vol. 92 No.3, pp.349-353.
Knapp, S. D., Cohen, L. B., & Juedes, D. R. (1998), "A natural language thesaurus for the
humanities: the need for a database search aid", The Library Quarterly, Vol. 68 No.4, pp 406430.
Program: electronic library and information systems, Vol. 46 Iss: 2, pp.258 - 276
Manouselis, N., Najjar, J. , Kastrantas, K. , Salokhe, G. , Stracke, C.M. , Duval, E. (2010),
"Metadata interoperability in agricultural learning repositories: An analysis", Computers and
Electronics in Agriculture. Vol. 70 No.2, pp.302-320.
Morris, T. (2001), "Visualizing the structure of medical informatics using term co-occurrence
analysis: II. INSPEC perspective", in Proceedings of the 64th ASIST Annual Meeting in
Washington D.C., USA, 2001, pp. 489-497.
Morshed, A., Johannsen, G., Keizer, J., Zeng, M.L. (2010), "Bridging End Users’ Terms and
AGROVOC Concept Server Vocabularies", in DCMI '10 Proceedings of the 2010 International
Conference on Dublin Core and Metadata Applications in Pittsburgh, USA, 2010, pp.186-189.
Parker, J. E. (1971), "Preliminary assessment of the comparative efficiencies of an SDI system
using controlled or natural language for retrieval", Program: electronic library and information
systems, Vol. 5 No.1, pp.26-34.
Salisbury, L., Gupta, U. (2004), "A Comparative Review of INSPEC on EBSCOHost, Engineering
Village (EV2), and Institute for Scientific Information (ISI)", Charleston Advisor. Vol. 6 No.1,
pp.5-11.
Savary, A., Jacquemin, C. (2003), "Reducing Information Variation in Text", in Renals, S. and
Grefenstette, G. (eds.), Text- and Speech-Triggered Information Access 8th ELSNET Summer
School, LNCS. vol. 2705. Springer, Heidelberg, pp.145-181..
Sewell, R. R. (2011), "Comparing Four CAB Abstracts Platforms from a Veterinary Medicine
Perspective", Journal of Electronic Resources in Medical Libraries, Vol. 8 No.2, pp.134-149.
Shaw Jr, W. M. (1993), "Controlled and uncontrolled subject descriptions in the CF database: a
comparison of optimal cluster-based retrieval results", Information processing & management,
Vol. 29 No.6, pp.751-763.
Sicilia, M.A. (2008), "Linking learning technology with agricultural knowledge organization
systems", in AgroLT 2008 Workshop on learning technology standards for agriculture and rural
development, September 18-20 September, Athens, Greece, available at: http://infolabdev.aua.gr/agrolt/2008/papers/AgroLT_siciliaInv_f.pdf (accessed 10 November 2011).
Soergel, D., Lauser, B., Liang, A., Fisseha, F., Keizer, J., Katz, S. (2004), "Reengineering Thesauri
for New Applications: the AGROVOC Example", Journal of Digital Information. Vol. 4 No.4
pp. 1-23.
Subirats, I., Onyancha, I., Salokhe, G. and Keizer, J. (2008),"Towards an architecture for open
archive networks in agricultural sciences and technology", Online Information Review. Vol. 32
No.4, pp.478-487.
Walters, W. H., & Wilder, E. I. (2003), "Bibliographic index coverage of a multidisciplinary field",
Journal of the American Society for Information Science and Technology, Vol. 54 No.14,
pp.1305-1312.
Weaver, C.G. (2002), "Gerontology and geriatrics: a multidisciplinary approach to indexing", in
Hornyak, B. (Ed.), Indexing Specialties: Psychology, Information Today & American Society
of Indexers, Medford, Wheat Ridge, pp. 41-48.
Younger, P., Boddy, K. (2009), "When is a search not a search? A comparison of searching the
AMED complementary health database via EBSCOhost, OVID and DIALOG", Health
Information & Libraries Journal. Vol. 26 No.2, pp.126-135.