Semantically Matching Tools and Data Collection Content: A ToolMatch Use Case Extension

by Matthew Ferritto

A Thesis Submitted to the Graduate Faculty of Rensselaer Polytechnic Institute in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE

Major Subject: COMPUTER SCIENCE

Approved by the examining committee:
Peter Fox, Thesis Advisor
Deborah McGuinness, Member
Jim Hendler, Member

Rensselaer Polytechnic Institute
Troy, NY
November 2014 (For Graduation December 2014)

Contents

List of Figures
List of Tables
Listings
Acknowledgments
Abstract
1 Introduction
1.1 Initial Work
1.2 Third Use Case
1.3 Future Work
1.4 Thesis Outline
2 Historical Background
2.1 Ontologies
2.2 SWRL
2.3 Semantic Matching
3 ToolMatch Service
3.1 ToolMatch Ontology
3.2 ToolMatch Instances
3.3 Semantic Matching of Tools and Data Collections
4 ToolMatch Web Service
4.1 ToolMatch Web Service (Tools)
4.2 ToolMatch Web Service (Data Collections)
5 Third Use Case
5.1 Observation and Measurements Ontology
5.2 Changes in the ToolMatch Ontology
5.3 Semantic Matching
5.4 Discussion and Conclusion
6 Future Work
7 Conclusion
References
Appendices

List of Figures

1 ToolMatch Ontology - Data Collections
2 ToolMatch Ontology - Tools
3 ToolMatch Instances - Tool and DataCollection
4 Inferencing Example
5 Web Service Tool List
6 Web Service Tool Splash Page
7 Web Service Tool Form
8 Web Service Data Collection Preliminary Form
9 Web Service Data Collection Main Form
10 ToolMatch Updated Ontology - Tool
11 ToolMatch Updated Ontology - Data Collection
12 ToolMatch Instances - Full
13 RDF/XML for Tool Instance (Panoply)

List of Tables

1 ToolMatch Classes
2 ToolMatch Object Predicates
3 ToolMatch Namespaces

Listings

1 ToolMatch Schema (N3)
2 Observation XML Example

Acknowledgments

I would like to express my gratitude to my advisor, Dr. Peter Fox, who guided me throughout the whole thesis process. His vision, guidance and reassurance were all instrumental in the writing of this thesis. I would also like to thank Dr. Jim Hendler and Dr. Deborah McGuinness, who were both on my Master’s Thesis committee. Additionally, I would like to thank Patrick West, Nancy Hoebelheinrich, and Chris Lynnes for all their contributions, as well as for all of their collaboration on the ToolMatch service. Their enthusiasm and work ethic made it a pleasure to work with them.

Abstract

The ToolMatch service was developed to provide data users with the means to match their data collections with a comprehensive list of useful, appropriate tools, and to provide data tool developers with data collections that will work with their tools. As such, ToolMatch had an initial scope of two use cases, the first of which was the semantic matching of data collections with tools. This would allow data users to find and choose among a list of otherwise separate and potentially hard to find tools that could work with their data collections. The second (and more difficult) of these use cases was the converse: given a tool, semantically find which data collections the tool can use. If the first use case is analogous to having nails and looking for a hammer, then the second use case can be compared to having a hammer and looking for nails.
It is much more difficult to find data collections that may work with a given tool, since a tool user might not necessarily know what to look for. Using the ToolMatch service, a tool user could easily find a data collection to use with their tool. In both of these use cases, wasted time and effort searching for the correct tool or data collection can be reduced or avoided completely.

The focus of this thesis will be on the implementation of these two use cases, as well as an extension of the first use case, where a data user with certain semantics for a given data collection (such as a domain model) can find tools that can be used with the content of that data. This is an important issue because certain data collection content may not be appropriate for a tool within a certain domain model. For example, rainfall or topographic data content that is part of a larger hydrological model can be matched to tools that the model as a whole might not be able to match. This expands the scope of the initial use case in that data collection content requires stricter matching than the characteristics of the data collection alone. The requirements of this use case involve modification and expansion of the ToolMatch conceptual model and ontology to allow for semantic matching between data content and tools. These changes will also be reflected in the ToolMatch web service, which allows users to add, update, or delete instances of the ToolMatch ontology without needing a full understanding of ontologies.

1 Introduction

For a given data collection, it is difficult to find the tools that can be used to work with that data collection. In many cases, the information that Tool A works with Data Collection B is somewhere on the Web, but not in a readily identifiable or discoverable form. In other cases, particularly for more generalized tools, the information does not exist at all until somebody tries to use the tool on a given data collection.
Conversely, for a given tool, it can be even more challenging to find data collections that work properly with that tool. Out of these two issues sprang two initial use cases for ToolMatch. The simplest and most prevalent use case developed was for the user to find the tools that can be used with a given data collection. For example, if a user has data collections that are accessible via OPeNDAP Hyrax in HDF5 format, how could these data collections be used? What tools are available to work with that data collection, and where could one find them?

The second use case is the converse of the first: given a data tool, how can a user find appropriate data collections that the tool will be able to work with? For a given tool, it can be difficult to find data collections that work properly with that tool. A common analogy for the first use case is that someone has a nail, but needs a hammer that works with that nail. The second use case can be compared to having a hammer and looking for the proper nails that can be used with that hammer. It is much more difficult to find data collections that may work with a given tool, since a tool user might not necessarily know what to look for.

Examples of this use case all center around a central premise. Given a tool, how could a user find a data collection with measurements of atmospheric aerosol optical depth sliced along latitude and longitude, returned as NetCDF data, and accessible in MATLAB? Which data collections are available in Giovanni? The final and perhaps the clearest example involves three tools: HDFView, Ferret, and Panoply, all of which can display data with different servers and different formats. HDFView can display swath data in DFA as line plots; Ferret can display swath data via OPeNDAP as a grid; Panoply can display swath data via OPeNDAP on a map. The second use case addresses this issue of finding which data collections can be used by each of these tools.
These two use cases formed the backbone of the ToolMatch service.

1.1 Initial Work

The first iteration of ToolMatch involved the creation of an ontology based upon the first two use cases, creating a simple set of concepts and relationships in order to match tools with data collections and data collections with tools. In addition, a web interface was created to display basic information about the ToolMatch model and to collect information about necessary characteristics of tools and data collections using simple web-based forms. Given the simple set of concepts and relationships, the form was able to return information about a tool or data collection using inferencing rules. Users could then view tools and data collections as well as make changes to them, without necessarily needing to understand ontologies. Using the ToolMatch service, a data user could easily find one or more tools that can visualize their data collection, provided that the tools meet the requirements. Conversely, a tool user can discover data collections to be used by their tool, rather than wasting time and effort searching for them using conventional methods.

1.2 Third Use Case

Since the initial scoping of the ToolMatch service, a third use case has been developed. This use case extends the first use case (semantic matching of tools and data collections) with additional conditions. Given a data collection and additional semantics in mind for that data collection (such as a domain model), find tools that can be used with the content of that data. For example, a researcher who has identified certain types of rainfall or topographic measurements with a hydrological model wants to know what tools will work with the measurement data. This use case involves the definition and documentation of the extension, as well as the adaptation of the conceptual model, the ontology, and the ToolMatch web service.
1.3 Future Work

With the first iteration of the ToolMatch service complete, we looked for ways to expand ToolMatch. In order to make further progress on the viability and robustness of the ToolMatch service, more instance data needs to be added to the knowledge store in the form of different kinds of visualization tools, and more data collections need to be added from a variety of science domains. Further populating the knowledge store will both confirm that the ToolMatch service meets the requirements of all three initial use cases and expand the applicability of the service to other science domains within the ESIP (Earth Science Information Partners) Federation. This will lead to the development of further use cases.

1.4 Thesis Outline

This thesis will first provide a brief overview of ontologies and semantic matching (section 2). This will be followed by a detailed description of the ToolMatch service, including the ontology and semantic matching of tools and data collections (section 3). The ToolMatch web service, an integral part of the use cases, will come after (section 4). Next will be an explanation of the third use case of the ToolMatch service, as well as its changes to the ToolMatch ontology and web service (section 5). Finally, the thesis will discuss future goals and work that ToolMatch hopes to achieve (section 6). Additional diagrams and the full ToolMatch schema (in N3) will be provided in the appendices.

2 Historical Background

2.1 Ontologies

ToolMatch uses a lightweight ontology of concepts and properties relating to tools and data collections. An ontology is a set of axioms that models a specific domain, concepts or objects within that domain, and the properties and relations between those objects. Ontologies contain two types of properties: data properties and object properties. Data properties are between an individual and a literal, while object properties are between two separate individuals.
Through the use of a reasoner, we can make inferences between properties that are implicitly contained in the ontology (Bechhofer et al. 2004). Ontologies are still one of the main areas of research in the realm of the semantic web, with approximately 59 percent of semantic web papers in 2013 focusing on them (Menemencioglu and Orkak 2014). In order for ToolMatch to achieve the semantic matching between tools and data collections, SWRL (Semantic Web Rule Language) is used.

2.2 SWRL

SWRL is based on a combination of the OWL DL and OWL Lite sublanguages of the Web Ontology Language with the RuleML sublanguages of the Rule Markup Language (Horrocks et al. 2004). The advantage of SWRL is that it enables Horn-like rules to be used in combination with any OWL knowledge store. A SWRL rule takes the form of an implication between an antecedent, or body, and its consequent, also known as a head. For any given rule, conditions are specified for the antecedent. If these conditions hold, or evaluate to true, any conditions specified in the consequent must be true as well (Horrocks et al. 2004). A very simple example of a SWRL rule is shown below:

    parent(?x,?y) ^ brother(?y,?z) -> uncle(?x,?z)

Given an individual (?x) who has a parent (?y) and another individual who is the brother of that parent (?z), it can be inferred that individual ?z is the uncle of individual ?x. While very basic, this rule shows the potential of inferencing across ontologies.

In summary, SWRL allows for rule-based reasoning across ontologies. These rules can be used by a reasoner to perform inferencing. In short, we use SWRL because it provides more expressive power than OWL DL does by itself. For the ToolMatch service, the lightweight ontology is combined with SWRL rules to determine matching tools and data collections.

2.3 Semantic Matching

Semantic matching involves the matching of relationships or concepts in ontologies that are semantically related, but often implicit.
Much of the current work being done in semantic matching revolves around the idea of discovering and matching web services. For instance, in order to address the problem of service searching in service-oriented applications, researchers described a method which improved the quality and precision of discovered web services by using semantic web technologies (Lu, Hsu, Kuo 2013). This proposal included a web service discovery method that combined WordNet, domain ontologies, and SWRL rules in web service matchmaking. Another paper detailed a framework for preference-based semantic matching between web services and security policies (Alhazbi, Khan, and Erradi 2013). This again uses an ontology for domain modeling, but instead uses a matching algorithm applied to a HermiT reasoner to determine the best web security policy option to be mapped to provider capabilities. Finally, another paper put forward an automated approach to semantic annotation based on the DBpedia knowledge base, which provides a solid foundation for service discovery and automated service composition (Zhang, Chen and Feng 2013). These methods all revolve around the creation of an ontology to represent concepts and ideas through classes and object properties, along with inferencing or other algorithms to semantically match web services.

As will be explained in later sections, ToolMatch applies semantic matching to the realm of data collections and tools. In addition, the ToolMatch web service offers the opportunity for users who are not ontologists to contribute to and modify the ToolMatch knowledge store.

3 ToolMatch Service

In order to properly match tools and data collections, ToolMatch uses an ontology and a set of rules that can help researchers determine what tools can be used given a particular set of data, or find data that can be used within a tool.
For the purpose of these initial two use cases we clarify that a data collection can be “used” by a tool when the tool can visualize the given data collection (represented by the visualizedBy object predicate).

3.1 ToolMatch Ontology

The schema displayed in Figure 1 shows the classes, object properties, and the predicates between them relating to the data collection part of the ToolMatch ontology. Here classes and object properties are entities and the object predicates are the relationships between them. Object predicates relating to the DataCollection class include hasDataFormat, isAccessedBy, hasAccessURL, usesConvention, and visualizedBy. While some of these may seem self-explanatory, we will nevertheless go through them for the sake of clarity.

In order to relate a data collection to the format in which it is stored (HDF, NetCDF, etc.), the predicate hasDataFormat is used. The term isAccessedBy states that a data collection can be used by a data server, where a data server is a piece of software that can access, manipulate, and return data products derived from data collections, including the data collections themselves. The hasAccessURL predicate relates each data collection to a URL, where a data collection can be addressed via that URL. The usesConvention predicate relates a data collection with a data convention, which states that each data collection follows a particular set of agreed upon rules in its representation and metadata. Other properties that describe a DataCollection instance include dc:title and dc:description, which are data properties and incorporate DC (Dublin Core) terms (DCMI Usage Board 2012). Each data collection can be uniquely identified by either a DOI (Digital Object Identifier) or a GCMD-DIF (Global Change Master Directory - Directory Interchange Format) Entry ID (NASA 2014). Unique identifiers for data collections will be discussed more in the next section.

Figure 1: The ToolMatch ontology defines a vocabulary for data collections. Rounded nodes in the graph represent OWL classes. Edges (black dashed lines) represent relationships (described as RDF properties) between classes, while solid blue lines indicate that one class is a subclass of the class it points to. Different colors are used to separate ToolMatch ontology terms from terms from other ontologies. This diagram and other ontology diagrams were created using the CMAP ontology editor.

Figure 2 showcases the Tool class along with object properties and predicates related to it. The object predicates include canUseAccessProtocol, canUseDataServer, hasInputFormat, hasOutputFormat, hasCapability, and isOfType. The first predicate, canUseAccessProtocol, specifies the access protocols a tool can use to access data. The term AccessProtocol refers to a standard set of regulations and requirements governing the access of data electronically. Similar to isAccessedBy for data collections, canUseDataServer indicates which instances of the DataServer class a Tool instance can interact with. The hasInputFormat and hasOutputFormat predicates detail what type of data a Tool instance can input or output, where the term DataFormat refers to the organization of information according to preset specifications for the storage of data. The hasCapability predicate specifies the different facilities available through a Tool instance for visualizing data products.

Figure 2: The ToolMatch ontology defines a vocabulary for tools. Rounded nodes in the graph represent OWL classes. Edges (black dashed lines) represent relationships (described as RDF properties) between classes, while solid blue lines indicate that one class is a subclass of the class it points to. Different colors are used to separate ToolMatch ontology terms from terms from other ontologies. This diagram and other ontology diagrams were created using the CMAP ontology editor.
ToolType describes the method in which a tool can be used (desktop app, browser app, web service, etc.). The terms dc:title and dc:description are again used to describe the basic information about a tool (DCMI Usage Board 2012). The URL for the homepage of a Tool instance is noted by doap:homepage, where DOAP refers to the Description of a Project vocabulary. We list the version of a Tool instance by relating a tool to doap:Version through doap:release (Dumbill 2014). The visualizedBy object property relates a DataCollection and a Tool instance, and means that a data collection is able to be visualized by, or “matches” with, a tool. How this matching is achieved will be explained in section 3.3.

In summary, the goal of ToolMatch was to develop a simple but effective ontology that represented key aspects of both tools and data collections and that would allow for future expansion. The ontology explained above forms the basis of the ToolMatch service, and allows for this effective representation. The complete list of ToolMatch classes and object predicates can be viewed in Table 1 and Table 2, while the full ToolMatch schema (in N3) is located in the appendices.

Table 1: ToolMatch Classes: Classes are the main concepts of the ToolMatch ontology. Classes may be the superclass or subclass of another class, or neither. Examples (instances) of each class are also shown.

Class               Superclass      Subclass                 Example
DataServer          -               -                        OPeNDAP Hyrax
DataAccessProtocol  -               -                        DAP
DataCollection      -               -                        Aqua AIRS Level2 Plus AMSU
DataConvention      -               DataGridType             CF Convention
DataFormat          -               -                        HDF4, HDF5, NETCDF
DataGridType        DataConvention  -                        -
Tool                -               ToolType, VisualizeType  ERDDAP
ToolType            Tool            -                        Desktop app, Browser app, Web service
URL                 -               -                        -
VisualizeType       Tool            -                        Gridded, Mapped

Table 2: ToolMatch Object Predicates: Object predicates relate one class in the ToolMatch ontology (domain) to another class or to a datatype (range).

Object Predicate      Domain                         Range
canUseAccessProtocol  Tool, ToolType, VisualizeType  DataAccessProtocol
canUseDataServer      Tool, ToolType, VisualizeType  DataServer
hasAccessURL          DataCollection                 URL
hasCapability         Tool, ToolType, VisualizeType  VisualizeType
hasDataFormat         DataCollection                 DataFormat
hasInputFormat        Tool, ToolType, VisualizeType  DataFormat
hasOutputFormat       Tool, ToolType, VisualizeType  DataFormat
isAccessedBy          Tool                           DataCollection, DataServer
isOfType              Tool, ToolType, VisualizeType  ToolType
providesAccess        DataServer                     DataAccessProtocol
usesConvention        DataCollection                 DataConvention, DataGridType
visualizedBy          DataCollection                 Tool, ToolType, VisualizeType

3.2 ToolMatch Instances

The instances of tools and data collections are shown in Figure 3. Note that a Tool instance may be a gridding tool or mapping tool, or neither. These are subclasses of the Tool class and provide further specification as to the abilities of a tool. A tool instance can be uniquely identified by its name or by its URL. Note that if a tool belongs to a larger toolset, we do not count the toolset as an instance, but rather each tool within the larger grouping. While only a small grouping of tools is listed here, it is sufficient for testing matchings with data collections and vice versa. Continued expansion and better testing of ToolMatch requires that more instances of tools be added to the triple store.

A data collection may be identified by either a DOI or a GCMD entry ID. The GCMD database contains more than 30,000 metadata descriptions of earth science data collections, and strives to achieve the overall goal of providing scientists with a “comprehensive and high quality database to reduce overall expenditures for scientific data collection and dissemination” (NASA 2014). The GCMD-DIF Entry ID is determined by the metadata author for the data collection and for ToolMatch serves as an identifier to complement a DOI.
A DOI acts as a unique identifier for a data collection, and allows for the construction and maintenance of metadata for that collection. While every data collection will have a DOI, not every one will have a GCMD-DIF Entry ID associated with it. Lastly, an access URL may be provided for a data collection. The access URL acts as a location for the landing page, which is maintained by the data collection creator. While an access URL is useful in locating metadata for a collection, a DOI or a GCMD entry ID is preferred. As with tools, more data collections are needed to better test and to expand the ToolMatch service.

Figure 3: Shown above are tool and data collection instances. Rounded nodes in the graph represent OWL classes. Square nodes represent instances. Edges represent relationships (described as RDF properties) between classes, while solid blue lines indicate that one class is a subclass of the class it points to. Note that a Tool instance may be a gridding tool or mapping tool, or neither. This diagram and other ontology diagrams were created using the CMAP ontology editor.

3.3 Semantic Matching of Tools and Data Collections

Semantic matching is the primary goal of both ToolMatch use cases, be it matching data collections with tools or vice versa. This can be achieved through the use of inferencing between ToolMatch ontological properties. SWRL (Semantic Web Rule Language) in particular allows the creation of rules.
The SWRL rule developed for these use cases (shown below), written in human-readable syntax, finds tools that can visualize a given data collection and vice versa:

    DataCollection(?dc) ^ Tool(?t) ^
    hasDataFormat(?dc, ?df) ^ isAccessedBy(?dc, ?ds) ^
    canUseAccessProtocol(?t, ?p) ^ providesAccess(?ds, ?p) ^
    hasInputFormat(?t, ?df) ^ canUseDataServer(?t, ?ds)
    => visualizedBy(?t, ?dc)

Here we state that if an instance of a DataCollection(?dc) and a Tool(?t) both have the same DataFormat (through the use of the hasDataFormat(?dc, ?df) and hasInputFormat(?t, ?df) properties), use the same DataServer (indicated by the isAccessedBy(?dc, ?ds) and canUseDataServer(?t, ?ds) properties), and the DataServer and the Tool both use the same data access protocol (canUseAccessProtocol(?t, ?p) and providesAccess(?ds, ?p)), then the DataCollection “matches” with the Tool (visualizedBy(?t, ?dc)).

Figure 4 provides a concrete example of semantic matching in ToolMatch. Here we are given a DataCollection instance, identified as the Aqua AIRS Level2 Plus AMSU, which consists of “retrieved estimates of cloud and surface properties, plus profiles of retrieved temperature, water vapor, ozone, carbon monoxide and methane” (AIRS Science Team 2013). This data collection has certain metadata properties (data server, data convention, data format, and grid type). From these properties we infer that the given data collection can be visualized by (matches) the IDV, McIDAS-V, and Panoply tools. If a data collection does not match with a tool, this means that no tools currently present in the ToolMatch knowledge store share the same metadata characteristics with the data collection, or that not enough information about the data collection was entered by the data user. Therefore, the more metadata a user enters about a tool or data collection, the more likely they are to find a match.
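The conditions in the rule body can be read as a simple join over shared format, server, and protocol values. The following is a minimal Python sketch of that reading, not the ToolMatch implementation (which relies on a reasoner applying the SWRL rule). The class layout, the function name matching_tools, and the instance names ExampleCollection, ToolA, ToolB, and ToolC are hypothetical, and each data collection is assumed to have a single format and a single server for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCollection:
    name: str
    data_format: str   # hasDataFormat (assumed single-valued here)
    data_server: str   # isAccessedBy (assumed single-valued here)


@dataclass(frozen=True)
class Tool:
    name: str
    input_formats: frozenset   # hasInputFormat
    data_servers: frozenset    # canUseDataServer
    protocols: frozenset       # canUseAccessProtocol


def matching_tools(dc, tools, server_protocols):
    """Return the names of tools that satisfy every atom of the rule body for dc."""
    # providesAccess: protocols offered by the collection's data server
    offered = server_protocols.get(dc.data_server, frozenset())
    return sorted(
        t.name
        for t in tools
        if dc.data_format in t.input_formats   # shared ?df
        and dc.data_server in t.data_servers   # shared ?ds
        and t.protocols & offered              # at least one shared ?p
    )


# Hypothetical instance data, for illustration only.
server_protocols = {"OPeNDAP Hyrax": frozenset({"DAP"})}
collection = DataCollection("ExampleCollection", "HDF5", "OPeNDAP Hyrax")
tools = [
    Tool("ToolA", frozenset({"HDF5", "NetCDF"}), frozenset({"OPeNDAP Hyrax"}), frozenset({"DAP"})),
    Tool("ToolB", frozenset({"NetCDF"}), frozenset({"OPeNDAP Hyrax"}), frozenset({"DAP"})),
    Tool("ToolC", frozenset({"HDF5"}), frozenset(), frozenset({"DAP"})),
]

print(matching_tools(collection, tools, server_protocols))
```

In this sketch ToolB fails on the data format and ToolC on the data server, which mirrors the observation above that missing metadata reduces the chance of a match.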
Figure 4: Inferencing Example: Given a data collection with certain properties described, we can infer what tools the data collection matches with. The “equivalent class” section represents the given information for the first use case. The “subclass of” section shows what is inferred from the given information, and is achieved by semantic matching of metadata for both tools and collections.

4 ToolMatch Web Service

4.1 ToolMatch Web Service (Tools)

The goal of the ToolMatch web service is to create a simple but effective user interface for data and tool users to find tools and/or data collections. For this service we assume these users are not ontologists, but are still interested in building formal models of an OWL ontology, testing the validity of the models, expressing rules using ontological concepts, and retrieving information via ontologically based queries. The service allows users to add, edit, and delete both tools and data collections from the ToolMatch triple store. Users may also query the ToolMatch triple store through the use of SPARQL, as well as view the ToolMatch schema in a readable format.

Figure 5 displays an example of a layout of all current Tool instances. These tools are queried via SPARQL from the ToolMatch triple store. Here only the name of the tool itself is shown, but each entry leads to a splash page for a tool. Again, each tool can be edited or deleted, and a user can find matching data collections that the tool can visualize or map.

Figure 6 shows a splash page for an individual Tool instance. This splash page displays the instance name, description, version, and access URL, as well as an image for the tool and any input or output formats. This information is retrieved from the ToolMatch triple store via SPARQL queries as well. This page allows the user to see more detailed information regarding a tool instance.
If necessary, the user can update any outdated information (such as the version of the tool) or correct any incorrect information about the tool instance. Again, only the information needed to identify the tool properly is provided.

Figure 5: The ToolMatch web service Tool List shows all Tool instances currently in the ToolMatch triple store. Each entry links to a splash page for the tool, and can be edited (via the tool form) or deleted, both of which use SPARQL queries to make necessary changes. Users may also find matching data collections, which will display in a list similar to this one.

Figure 7 displays the ToolMatch tool form. This allows the user to enter or edit information that describes a tool instance. This includes the tool name, description, version, and the tool page URL, as well as other optional fields such as the tool logo, input and output formats, tool capabilities and the tool type. Once each tool is submitted, RDF/XML for a tool instance is created and then stored in the ToolMatch triple store as a graph. Sample RDF for the Tool instance Panoply can be viewed in the appendices.

Figure 6: The ToolMatch web service splash page displays essential information about each Tool instance. This includes a brief description of the tool, the current version, the URL for the tool page, input and output formats, tool capability and tool type. Splash pages for data collections (not shown) are similar in this regard.

Figure 7: The ToolMatch web service tool form allows users to enter information to add or update a Tool instance. This includes the tool name, description, version, and the tool page URL, as well as other optional fields such as the tool logo, input and output formats, tool capabilities and the tool type.

4.2 ToolMatch Web Service (Data Collections)

The web service for data collections is similar to that for tools, albeit with some differences.
Figure 8 shows the first part of the data collection form, which asks the user to enter one or more collection identifiers: a DOI, a GCMD entry ID, or an access URL for the data collection. If the user enters a DOI or access URL, the landing page for the data collection is shown alongside the following form. Similarly, if a GCMD entry ID is entered, the main form is pre-populated with information about the data collection. This saves the user from having to search for all the relevant information regarding a data collection. Instead, a user simply has to know the DOI, GCMD entry ID, or access URL for that data collection. Requiring less information from the user reduces data entry errors and improves the accuracy of the information recorded about the data collection.

Figure 9 details the main input form for data collections. Similar to the tool form, the data collection form asks for basic information about a data collection, including the name, description, a DOI, GCMD entry ID, and an access URL, as well as information about the data collection’s format, convention, and server accessibility. Once a data collection is entered, it is added to the ToolMatch triple store and is then displayed in a list similar to that for tools, where each entry leads to a splash page for the data collection. From there data collections can be edited or deleted, or a user can search for matching tools that can visualize or map the data collection. Data collections in the triple store can also be queried through SPARQL.

The ToolMatch web service, in conclusion, is based on a simple ontology and set of rules that describe what types of tools work with what types of data collections, and vice versa. This helps to facilitate a crowd-sourced approach for domain experts who are not ontologists by allowing ToolMatch clients direct access to the knowledge base without necessarily needing to understand ontologies. However, users who wish to use SPARQL can still query the ToolMatch triple store directly.
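As a rough illustration of the SPARQL access mentioned above, the following Python sketch composes a query that lists the tools inferred to visualize a given data collection. It relies on the schema namespace and the visualizedBy property from the ToolMatch schema in the Appendices. The instance IRI is hypothetical, the query assumes that inferred visualizedBy triples are available in the store, and no endpoint is contacted; this is not the query the web service itself issues.

```python
# Namespace taken from the ToolMatch schema listing in the Appendices.
TOOLMATCH_NS = "http://toolmatch.esipfed.org/schema#"


def matching_tools_query(data_collection_iri):
    """Build a SPARQL SELECT query for tools that visualize one data collection.

    Assumes inferred tm:visualizedBy triples are available in the triple store.
    """
    return (
        f"PREFIX tm: <{TOOLMATCH_NS}>\n"
        "SELECT ?tool WHERE {\n"
        f"  <{data_collection_iri}> a tm:DataCollection ;\n"
        "      tm:visualizedBy ?tool .\n"
        "  ?tool a tm:Tool .\n"
        "}"
    )


# Hypothetical instance IRI, used only for illustration.
query = matching_tools_query("http://example.org/instances/ExampleCollection")
print(query)
```

The resulting string could be submitted to whichever SPARQL endpoint fronts the triple store; the direction of visualizedBy (data collection to tool) follows the domain and range declared in the schema.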
Figure 8: The ToolMatch web service data collection form prompts the user to enter a data collection DOI, GCMD entry ID, or access URL. Once submitted, the following form is pre-populated with information about the data collection if a GCMD entry ID is entered. If a DOI or access URL is entered, the data collection landing page is shown alongside the form. This avoids unnecessary searching for information about the data collection.

Figure 9: The ToolMatch web service data collection form prompts users to enter information to add or update a DataCollection instance. This includes the name, description, a DOI, GCMD entry ID, and an access URL, as well as information about the data collection’s format, convention, and server accessibility.

5 Third Use Case

As stated in the introduction, the third use case extends the scope of the first use case but at the same time makes its parameters more specific. Given a data collection, and with additional semantics in mind for that data collection, find tools that can be used with the content of that data collection. The third use case involves modifications to the ToolMatch conceptual model and ontology, as well as to the ToolMatch web service.

In order to implement the changes needed in the third use case, additions and modifications to the existing ToolMatch ontology must be made, many of which revolve around the Tool and DataCollection classes. Because we are attempting to match tools with data collection content, which includes measurements and sensor readings, ToolMatch will need a method of matching based upon these specific attributes. This necessitates expanding the ToolMatch ontology, specifically the Tool and DataCollection classes, and requires the incorporation of the OGC (Open Geospatial Consortium) and ISO Observations and Measurements ontology.
5.1 Observations and Measurements Ontology

The Observations and Measurements ontology (published as ISO/DIS 19156) defines a conceptual model for observations, and for features involved in sampling when making observations (OGC 2011). The goal of the ontology is to enable interoperability between scientific and technical communities. At the heart of the ontology is the observation. An observation can be defined as “an act associated with a discrete time instant or period through which a number, term or other symbol is assigned to a phenomenon” (Cox 2012). This is most often achieved through the use of a sensor or other instrument. The result of an observation, as described in the model itself, is “an estimate of the value of a property of some feature” (Cox 2012).

The Observations and Measurements ontology defines, for each observation, a feature of interest. This is incorporated through the General Feature Model (ISO 19109) (Cox 2012). A feature of interest is defined as a “typed object with identity,” an example being vector data (Cox 2013). For each feature there is a feature type, which is defined by a characteristic set of properties, such as “attributes, associations, operations”. These feature types are usually specific to an application domain, and map to objects that exist in the real world, for instance a “road, mine, truck, or storm” (Cox 2013). An XML observation example (taken from the OGC site) is shown in the Appendices. Given that a data collection has data content, which contains observations that relate to a general feature type, we can semantically match a given tool with data collection content.

5.2 Changes in the ToolMatch Ontology

Figures 10 and 11 show the changes to the ToolMatch ontology. The Tool class will undergo several changes. Each Tool instance will include the object property hasDomain, whose range will be the Domain class. The Domain class can have multiple features.
For example, given a hydrological model, features could include sinks, flow direction, flow accumulation, watersheds, stream networks, etc.

The DataCollection class, on the other hand, sees several new additions. Each DataCollection instance will be linked with an observation instance by the object property hasObservation. Each observation will have a feature of interest (through the property featureOfInterest), which holds conceptual significance within the application domain. The matching here occurs between each data collection’s Feature property and each tool’s Domain property. The matching between these two properties will occur in addition to the previously stated conditions for matching, since the data content must still have the same properties (data format, server accessibility, etc.).

Figure 10: The updated ToolMatch ontology for tools takes into consideration the conditions of the third use case. In addition to the notations described in previous ontology figures, any changes are shown in purple. These changes include two classes from the Observations and Measurements ontology, as well as two object properties detailing the relationship between the classes themselves and between the classes and the Tool class.

Figure 11: The updated ToolMatch ontology for data collections takes into consideration the conditions of the third use case. In addition to the notations described in previous ontology figures, any changes are shown in purple. These changes include two classes from the Observations and Measurements ontology, as well as two object properties detailing the relationship between the classes themselves and between the classes and the DataCollection class.

5.3 Semantic Matching

This expansion of the ToolMatch ontology necessitates an extension of the existing SWRL rules as well. The third use case here requires matching between content and tools.
The matching for tools and data collection content now takes in the added parameters of domain and features. Given data collection content, which has a domain and features that fall within that domain, find tools that match that data collection content. If a tool meets all the requirements previously necessary for tools and data collections, and if both the tool and the content have the same features in an application domain (represented by hasDomain(?t, ?d) and featureOfInterest(?d, ?f) for a tool, and by hasObservation(?dc, ?o) and featureOfInterest(?o, ?f) for data collection content), then it can be stated that they match (visualizedBy(?t, ?dc)). The updated SWRL for matching a given data collection to a tool is shown below:

DataCollection(?dc) ^ Tool(?t) ^
hasDataFormat(?dc, ?df) ^ isAccessedBy(?dc, ?ds) ^
canUseAccessProtocol(?t, ?p) ^ providesAccess(?ds, ?p) ^
hasDomain(?t, ?d) ^ featureOfInterest(?d, ?f) ^
hasInputFormat(?t, ?df) ^ canUseDataServer(?t, ?ds) ^
hasObservation(?dc, ?o) ^ featureOfInterest(?o, ?f)
=> visualizedBy(?t, ?dc)

In addition to the previous matching between the DataFormat and DataServer of a Tool and DataCollection instance, matching now occurs between the domain of a Tool instance and DataCollection content. Specifically, if the featureOfInterest of an observation in the content belongs to a Tool’s domain, then the content and the tool are said to be matched.

As an example, suppose that a user has watershed measurements and needs a tool to model those measurements. ArcGIS offers a hydrological toolset containing multiple tools that model the flow of water across a surface (ArcGIS Resource Center 2011). While one or more of these tools may be useful with the content that a data user has, it would be time consuming to test each tool within the toolset to determine its usefulness. If a tool within the toolset and the data content had the same features, and therefore the same domain, we could determine that the tool could model the data collection content.
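The conditions of this rule can be sketched in plain Python to make the joins explicit. This is only an illustration under stated assumptions, not the SWRL reasoner used by ToolMatch: facts are held as (predicate, subject, object) tuples, all instance names are hypothetical, and, following the prose description, a tool's domain is linked to its features through featureOfInterest.

```python
from itertools import product


def match_tools(facts):
    """Return (tool, data collection) pairs that satisfy the third use case rule.

    `facts` is a set of (predicate, subject, object) tuples.
    """

    def objects(predicate, subject):
        return {o for (p, s, o) in facts if p == predicate and s == subject}

    def instances(cls):
        return {s for (p, s, o) in facts if p == "type" and o == cls}

    def features(holders):
        # Union of featureOfInterest values over a set of domains or observations.
        found = set()
        for holder in holders:
            found |= objects("featureOfInterest", holder)
        return found

    matches = set()
    for dc, tool in product(instances("DataCollection"), instances("Tool")):
        # hasDataFormat(?dc, ?df) ^ hasInputFormat(?t, ?df)
        shared_formats = objects("hasDataFormat", dc) & objects("hasInputFormat", tool)
        # isAccessedBy(?dc, ?ds) ^ canUseDataServer(?t, ?ds)
        shared_servers = objects("isAccessedBy", dc) & objects("canUseDataServer", tool)
        # providesAccess(?ds, ?p) ^ canUseAccessProtocol(?t, ?p)
        protocols = objects("canUseAccessProtocol", tool)
        server_ok = any(objects("providesAccess", ds) & protocols for ds in shared_servers)
        # hasDomain(?t, ?d) ^ featureOfInterest(?d, ?f)
        #   ^ hasObservation(?dc, ?o) ^ featureOfInterest(?o, ?f)
        shared_features = (features(objects("hasDomain", tool))
                           & features(objects("hasObservation", dc)))
        if shared_formats and server_ok and shared_features:
            matches.add((tool, dc))
    return matches


# Hypothetical facts: two tools share format and server access with the
# data collection, but only one has a domain containing the watershed feature.
FACTS = {
    ("type", "watershed_readings", "DataCollection"),
    ("hasDataFormat", "watershed_readings", "netcdf"),
    ("isAccessedBy", "watershed_readings", "example_server"),
    ("hasObservation", "watershed_readings", "obs1"),
    ("featureOfInterest", "obs1", "watershed"),
    ("providesAccess", "example_server", "example_protocol"),
    ("type", "hydrology_tool", "Tool"),
    ("hasInputFormat", "hydrology_tool", "netcdf"),
    ("canUseDataServer", "hydrology_tool", "example_server"),
    ("canUseAccessProtocol", "hydrology_tool", "example_protocol"),
    ("hasDomain", "hydrology_tool", "hydrology"),
    ("featureOfInterest", "hydrology", "watershed"),
    ("type", "generic_viewer", "Tool"),
    ("hasInputFormat", "generic_viewer", "netcdf"),
    ("canUseDataServer", "generic_viewer", "example_server"),
    ("canUseAccessProtocol", "generic_viewer", "example_protocol"),
}

print(match_tools(FACTS))
```

In this sketch the generic viewer satisfies every first use case condition but is excluded because it declares no domain, which is the narrowing effect the third use case is meant to have.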
From this matching, a user would be able to find a tool within a hydrological model (based upon the domain of the tool) to visualize the data content.

5.4 Discussion and Conclusion

The third use case expands the scope of the first use case, but at the same time specifies what type of domain data collection content can have. In this manner we can effectively match data collection content with tools that can be used with it. Further testing of this use case requires the addition of more instances for both tools and data collections.

In addition, the ToolMatch web service must also be updated to reflect the changes already detailed in the ToolMatch ontology and SWRL rules. The primary changes would occur in the forms for both tools and data collections. The tool form would be expanded to allow a tool user to input the domain of the tool, as well as features that fall within a given domain. For data collections, the current form would be altered to let users provide observations for data collection content. Each observation would be associated with a feature of interest. In the future this could be expanded to allow a data user to simply submit the data collection content (as an RDF/XML or JSON object), and the form would automatically find what tools would work with that content.

6 Future Work

As ToolMatch has begun adding information to the underlying knowledge store, it has become increasingly clear that employing other sources of information, such as data catalogs and data or tool registries, is not only advisable but critical to leverage and scale a service like ToolMatch. In-depth analysis of the types of data collections, visualization tools, and technologies used by these data catalogs and registries will be necessary in order to understand how the ToolMatch service can use them in a practicable, scalable manner.
This kind of analysis will also help the ToolMatch team move toward the goal of demonstrating how the service can be incorporated into existing sets of information services found at data and archive centers.

Additionally, ToolMatch seeks to continually incorporate new data collections and new tools, as well as annotations for each. This will allow for more thorough testing of the ToolMatch service on a wider scale for both tools and data collections. Finally, while the ToolMatch web service is designed for “lay users”, or those who are less familiar with ontologies and SPARQL, it is not designed for those who are experts in those fields. Thus a further extension would be the option for users to submit data collections or tools through SPARQL update queries, or by submitting new triples directly to the ToolMatch knowledge base via a RESTful service.

To date, most of the data collections and tools that have been included in the ToolMatch service have been NASA-generated. The initial interest in integrating a data catalog (a project initiated by the ESIP Energy and Climate cluster known as the Decision Support Tools Catalog and Community of Practice) fell through when development of the catalog was halted. In its place, other opportunities have arisen. In particular, several other communities represented within the ESIP Federation, such as USGS (U.S. Geological Survey) and NOAA (National Oceanic and Atmospheric Administration), have expressed interest in using the ToolMatch service if it can meet their specific needs. These communities are especially interested in exploring the integration of data catalogs with ToolMatch.

As described in the previous section, the ToolMatch web service must also be changed to reflect the modifications already present in the ToolMatch ontology and SWRL rules for the third use case. More instances for tools, and specifically for data collection content, must also be added.
This will allow for sufficient testing of the inferencing developed for the third use case, and will ensure that data users can properly utilize the benefits of the ToolMatch service.

Finally, it is the hope of ToolMatch that even more use cases will be developed that move beyond the scope of the first three use cases. This could include the development of a use case where rules map entire classes of tools to classes of data collections. Such a use case would be much broader in scope, as a class here represents a large group of either tools or data collections instead of just one. These are the types of use cases that ToolMatch hopes to tackle in the future.

7 Conclusion

Using a simple ontology and a basic web service, ToolMatch allows data users to find tools to work with their data, and tool users to find data collections that the tools can visualize or map. This can avoid much of the time and effort of a conventional search, since semantic matching does the work for the user. A tool user would simply need to know the name of the tool that they have, while a data user would only need a unique identifier for a data collection (or the data collection itself). With the third use case, data users with specific content could also find matching tools within a toolset, even where the toolset as a whole might not fit that content.

The ToolMatch service offers great potential in the area of semantic matching, and promises to bring many benefits to the ESIP community, as well as to other scientific domains to which it could expand. Community testing of the ToolMatch service would help refine the understanding of the use cases underlying the service, and further the definition, design, and implementation of the service.
In addition, that testing and refinement would further demonstrate the utility of the underlying Semantic Web technologies that are focused upon science data collections and science data tools, and move the science data community forward in this area.

While the ToolMatch service works with the three simple but effective use cases, making the service openly available to a broader community of data users and tool developers could persuade others to utilize, improve, and expand the service. In addition, exploring the factors involved in incorporating a ToolMatch service into other important existing information service tools, such as data, tool, and service catalogs and registries, and into existing information service suites offered by data centers and archives, would greatly inform and influence the adoption and improvement of such a matching service.

References

GES DISC (Goddard Earth Sciences Data and Information Services Center). “AIRX2RET Version 006: Aqua AIRS Level 2 Standard Physical Retrieval (AIRS AMSU).” Accessed November 18, 2014. http://disc.sci.gsfc.nasa.gov/datacollection/AIRX2RET_V006.html?AIRX2RET.

Alhazbi, S., K.M. Khan, and A. Erradi. “Preference-based semantic matching of web service security policies.” Paper presented at the World Congress on Computer and Information Technology (WCCIT), Sousse, Tunisia, June 22-24, 2013.

ArcGIS Resource Center. “An Overview of the Hydrology Toolset.” Accessed November 17, 2014. http://help.arcgis.com/en/arcgisdesktop/10.0/help/index.html

Cox, Simon. “Overview of Some Relevant Standards from ISO/TC 211.” Last modified October 28, 2013. https://www.seegrid.csiro.au/wiki/AppSchemas/IsoTc211Standards.

Cox, Simon. “OWL Representation of ISO 19156 (Observation Model).” Last modified July 24, 2012. http://def.seegrid.csiro.au/isotc211/iso19156/2011/observation.

Dublin Core Metadata Initiative. “DCMI Metadata Terms.” Last modified June 14, 2012. http://dublincore.org/documents/dcmi-terms/.

ESIP Tools Catalog.
“Decision Support Tools Catalog and Community of Practice.” Accessed November 13, 2014. http://dstccp.esipfed.org/esip/.

Dumbill, Edd. “Description of a Project.” Last modified January 26, 2014. https://github.com/edumbill/doap/wiki.

Ghomari, L. and A.R. Ghomari. “A Comparative Study: Syntactic versus Semantic Matching Systems.” Paper presented at the International Conference on Complex, Intelligent and Software Intensive Systems, Fukuoka, Japan, March 16-19, 2009.

Horrocks, Ian, Peter F. Patel-Schneider, Harold Boley, Said Tabet, Benjamin Grosof, and Mike Dean. “SWRL: A Semantic Web Rule Language Combining OWL and RuleML.” Accessed November 17, 2014. http://www.w3.org/Submission/SWRL/.

Kuhn, Werner. “A functional ontology of observation and measurement.” In GeoSpatial Semantics, 26-43. Springer Berlin Heidelberg, 2009.

Lu, Shao Yuan, Kuo-Hsun Hsu, and Li-Jing Kuo. “A Semantic Service Match Approach Based on WordNet and SWRL Rules.” Paper presented at the e-Business Engineering (ICEBE), 2013 IEEE 10th International Conference, Coventry, United Kingdom, September 11-13, 2013.

Luo, An, Yandong Wang, Lafang Wang, and You He. “Multi-level Semantic Matching of Geospatial Web Services.” Paper presented at the International Conference on Geoinformatics, Fairfax, Virginia, August 12-14, 2009.

Menemencioglu, O. and I.M. Orkak. “A Review on Semantic Web and Recent Trends in Its Applications.” Paper presented at the IEEE International Conference on Semantic Computing (ICSC), Newport Beach, California, June 16-18, 2014.

National Aeronautics and Space Administration. “Directory Interchange Format (DIF) Writer’s Guide.” Last modified January 1, 2014. http://gcmd.nasa.gov/add/difguide/.

Open Geospatial Consortium. “Observations and Measurements.” Accessed November 18, 2014. http://www.opengeospatial.org/standards/om.

Peng, Hui.
“Context Aware Semantic Web Services Description and Match.” Paper presented at the International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Beijing, China, October 10-12, 2013.

Smith, Michael, Chris Welty, and Deborah McGuinness. “OWL Web Ontology Language Guide.” Accessed November 17, 2014. http://www.w3.org/TR/owl-guide/.

World Wide Web Consortium. “OWL Representation of ISO 19109 (General Feature Model).” Last modified July 1, 2012. http://def.seegrid.csiro.au/isotc211/iso19109/2005/feature#.

World Wide Web Consortium. “OWL Web Ontology Language Reference.” Accessed November 17, 2014. http://www.w3.org/TR/owl-ref/.

Zhang, Zhen, Shizhan Chen, and Zhiyong Feng. “Semantic Annotation for Web Services Based on DBpedia.” Paper presented at IEEE 7th International Symposium on Service Oriented System Engineering (SOSE), Redwood City, California, March 25-28, 2013.

Appendices

Listing 1: ToolMatch Schema (N3)

@prefix : <http://toolmatch.esipfed.org/schema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix tw: <http://tw.rpi.edu/schema/> .
@prefix twi: <http://tw.rpi.edu/instances/> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://toolmatch.esipfed.org/schema> a owl:Ontology ;
    rdfs:label "ToolMatch Ontology and Rules"@en ;
    dcterms:contributor twi:Christopher_Lynnes ,
        twi:PatrickWest ,
        <http://tw.rpi.edu/instances/person/MatthewFerritto> ,
        <http://tw.rpi.edu/instances/person/NancyHoebelheinrich> ;
    dcterms:creator twi:EricRozell ;
    dcterms:date "2014-07-01" ;
    dcterms:publisher twi:ESIPFederation ;
    dcterms:rights "This ontology is distributed under a Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/"@en ;
    rdfs:comment """ToolMatch is an ontology and set of rules that can help researchers determine what tools can be used given a particular set of data, or find data that can be used within a tool."""@en ;
    owl:imports time: , foaf: ;
    owl:versionInfo "v 1.0 $Revision: 9878 $ $Author: pwest $ $Date: 2014-07-03 23:14:15 -0400 (Thu, 03 Jul 2014) $" .

:DataGridType a owl:Class ;
    rdfs:label "Data Grid Type"@en ;
    rdfs:comment "Gridded data (or raster data) are the result of converting scattered individual data points from one or more sources into a regular \"grid\" (or \"raster\") of calculated values based on spatial parameters"@en ;
    rdfs:subClassOf :DataConvention .

:canUseAccessProtocol a owl:ObjectProperty ;
    rdfs:label "Can Use Data Access Protocol"@en ;
    rdfs:comment """Specifies the access protocols that the tool can use to access data"""@en ;
    rdfs:domain :Tool ;
    rdfs:range :DataAccessProtocol .

:canUseDataServer a owl:ObjectProperty ;
    rdfs:label "Can Use Data Server"@en ;
    rdfs:comment """Specifies the type of data servers that the tool can interact with"""@en ;
    rdfs:domain :Tool ;
    rdfs:range :DataServer .

:hasAccessURL a owl:ObjectProperty ;
    rdfs:label "Has Access URL"@en ;
    rdfs:comment """A data collection can be accessed via a particular URL"""@en ;
    rdfs:domain :DataCollection ;
    rdfs:range :URL .

:hasCapability a owl:ObjectProperty ;
    rdfs:label "Has Capability"@en ;
    rdfs:comment """Specifies the different facilities available through the tool for visualizing data products"""@en ;
    rdfs:domain :Tool ;
    rdfs:range :VisualizeType .

:hasDataFormat a owl:ObjectProperty ;
    rdfs:label "Has Data Format"@en ;
    rdfs:comment """A Data Collection is stored in a particular data format, such as hdf5, netcdf"""@en ;
    rdfs:domain :DataCollection ;
    rdfs:range :DataFormat .

:hasInputFormat a owl:ObjectProperty ;
    rdfs:label "Has Input Format"@en ;
    rdfs:comment """A tool can input or import data products that are of a particular format, such as netcdf4, hdf5"""@en ;
    rdfs:domain :Tool ;
    rdfs:range :DataFormat .

:hasOutputFormat a owl:ObjectProperty ;
    rdfs:label "Has Output Format"@en ;
    rdfs:comment """A tool can output data products in a particular format, such as jpg, json, netcdf4"""@en ;
    rdfs:domain :Tool ;
    rdfs:range :DataFormat .

:isAccessedBy a owl:ObjectProperty ;
    rdfs:label "Is Accessed By"@en ;
    rdfs:comment """A Data Collection can be accessed by a particular Server"""@en ;
    rdfs:domain :DataCollection ;
    rdfs:range :DataServer .

:isOfType a owl:ObjectProperty ;
    rdfs:label "Is Of Type"@en ;
    rdfs:comment """The tool can be a desktop app, browser app, web service, etc..."""@en ;
    rdfs:domain :Tool ;
    rdfs:range :ToolType .

:providesAccess a owl:ObjectProperty ;
    rdfs:label "Provides Access"@en ;
    rdfs:comment """A Data Server can provide access to data and information using a particular protocol"""@en ;
    rdfs:domain :DataServer ;
    rdfs:range :DataAccessProtocol .

:usesConvention a owl:ObjectProperty ;
    rdfs:label "Uses Convention"@en ;
    rdfs:comment """A data collection follows a particular set of agreed upon rules in its representation and metadata"""@en ;
    rdfs:domain :DataCollection ;
    rdfs:range :DataConvention .

:visualizedBy a owl:ObjectProperty ;
    rdfs:label "Visualized By"@en ;
    rdfs:comment """A Data Collection can be visualized by a particular tool"""@en ;
    rdfs:domain :DataCollection ;
    rdfs:range :Tool .

:ToolType a owl:Class ;
    rdfs:label "Tool Type"@en ;
    rdfs:comment "The method in which a tool can be used, such as a desktop app, browser app, web service, etc..."@en .

:URL a owl:Class .

:DataAccessProtocol a owl:Class ;
    rdfs:label "Data Access Protocol"@en ;
    rdfs:comment "A standard set of regulations and requirements governing the access of data electronically"@en .

:DataConvention a owl:Class ;
    rdfs:label "Data Convention"@en ;
    rdfs:comment "A Data Convention is an established technique or practice related to the representation of data in data collections."@en .

:DataFormat a owl:Class ;
    rdfs:label "Data Format"@en ;
    rdfs:comment "The organization of information according to preset specifications for the storage of data"@en .

:DataServer a owl:Class ;
    rdfs:label "Data Server"@en ;
    rdfs:comment "A ToolMatch Data Server is a piece of software that can access, manipulate, and return data products derived from datasets, including the datasets themselves"@en .

:DataCollection a owl:Class ;
    rdfs:label "Data Collection"@en ;
    rdfs:comment "A collection of related sets of information that is composed of separate elements but can be manipulated as a unit by a computer."@en .

:Tool a owl:Class ;
    rdfs:label "Tool"@en ;
    rdfs:comment "A ToolMatch Tool is a computer-based utility that can be used for data access, manipulation, and visualization."@en .

Listed above is the full ToolMatch schema in N3 format, with annotations. Listed first are all object properties, followed by all class descriptions. Classes represent concepts within the ontology, while object properties act as the relationships between them. The N3 (Notation 3) format was developed with human readability in mind, in contrast to RDF/XML.

Figure 12: Shown above are ToolMatch instances for (from top to bottom) data formats, data grid types, data conventions, data access protocols, and data servers. Rounded nodes in the graph represent OWL classes. Square nodes represent instances. Edges represent relationships (described as RDF properties) between classes, while solid blue lines indicate that one class is a subclass of the class it points to. Note that a Tool instance may be a gridding tool or mapping tool, or neither. This diagram and other ontology diagrams were created using the CMAP ontology editor.

Figure 13: The code shown above represents the RDF/XML for the Tool instance Panoply. Namespaces are listed in Table 3.

Table 3: A list of namespaces and prefixes used throughout the thesis.

Prefix  URI
dc      http://purl.org/dc/terms/
foaf    http://xmlns.com/foaf/0.1/
time    http://www.w3.org/2006/time#
twi     http://tw.rpi.edu/instances/
tw      http://tw.rpi.edu/schema/
xsd     http://www.w3.org/2001/XMLSchema#
rdf     http://www.w3.org/1999/02/22-rdf-syntax-ns#
dcat    http://www.w3.org/ns/dcat#
doap    http://usefulinc.com/ns/doap#
owl     http://www.w3.org/2002/07/owl#
gcmd    http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/

Listing 2: Observation XML Example

<!-- Observation example for sampling geometry extension for observations as defined in SOS extension TBD -->
<om:OM_Observation xmlns:om="http://www.opengis.net/om/2.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:gml="http://www.opengis.net/gml/3.2"
    gml:id="obsTest1"
    xsi:schemaLocation="http://www.opengis.net/om/2.0 http://schemas.opengis.net/om/2.0/observation.xsd">
  <!-- optional description of observation -->
  <gml:description>Spatial observation test instance: water level</gml:description>
  <!-- optional name of observation -->
  <gml:name>Spatial observation test 1</gml:name>
  <!-- phenomenon time of observation -->
  <om:phenomenonTime>
    <gml:TimeInstant gml:id="pt1">
      <gml:timePosition>2010-03-08T16:22:25.00</gml:timePosition>
    </gml:TimeInstant>
  </om:phenomenonTime>
  <!-- result time is same as phenomenon time of observation -->
  <om:resultTime xlink:href="#pt1"/>
  <!-- link to DescribeSensor operation of SOS which is providing the sensor description -->
  <om:procedure xlink:href="http://mySOSURL?service=SOS&amp;request=DescribeSensor&amp;version=2.0.0&amp;procedureIdentifier=&quot;procedure1&quot;"/>
  <!-- parameter containing samplingPoint as defined in SOS 2.0 Extension - Data Encoding Restriction -->
  <om:parameter>
    <om:NamedValue>
      <om:name xlink:href="http://www.opengis.net/req/omxml/2.0/data/samplingGeometry"/>
      <om:value>
        <gml:Point gml:id="SamplingPoint">
          <gml:pos srsName="urn:ogc:def:crs:EPSG:4326">52.9 7.52</gml:pos>
        </gml:Point>
      </om:value>
    </om:NamedValue>
  </om:parameter>
  <!-- a notional URN identifying the observed property -->
  <om:observedProperty xlink:href="http://sweet.jpl.nasa.gov/2.0/hydroSurface.owl#WaterHeight"/>
  <!-- a notional WFS call identifying the object regarding which the observation was made -->
  <om:featureOfInterest xlink:href="http://wfs.example.org?request=getFeature&amp;featureid=river1"/>
  <!-- The XML Schema type of the result is indicated using the value of the xsi:type attribute -->
  <om:result xsi:type="gml:MeasureType" uom="cm">28</om:result>
</om:OM_Observation>

Listed above is an example of XML for an observation. Important to note here is om:featureOfInterest, which the Observations and Measurements ontology integrates from the General Feature Model. This property will be incorporated into the ToolMatch ontology to describe observations in data collections, and will be used in the matching of domains between data collection content and tools.
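As a small illustration of how the featureOfInterest reference might be read from such an observation before matching, the following Python sketch uses the standard library XML parser. The helper name and the trimmed sample document are assumptions made for this example, and ToolMatch itself does not necessarily process observations this way.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the observation listing above.
OM_NS = "http://www.opengis.net/om/2.0"
XLINK_NS = "http://www.w3.org/1999/xlink"


def feature_of_interest(observation_xml):
    """Return the xlink:href of om:featureOfInterest, or None if it is absent."""
    root = ET.fromstring(observation_xml)
    element = root.find("om:featureOfInterest", {"om": OM_NS})
    if element is None:
        return None
    # ElementTree exposes namespaced attributes in {uri}name form.
    return element.get(f"{{{XLINK_NS}}}href")


# Trimmed-down version of the observation example, kept well-formed.
SAMPLE = """<om:OM_Observation xmlns:om="http://www.opengis.net/om/2.0"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <om:observedProperty
      xlink:href="http://sweet.jpl.nasa.gov/2.0/hydroSurface.owl#WaterHeight"/>
  <om:featureOfInterest
      xlink:href="http://wfs.example.org?request=getFeature&amp;featureid=river1"/>
</om:OM_Observation>"""

print(feature_of_interest(SAMPLE))
```

The returned reference is what would be compared against the features declared for a tool's domain under the third use case.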