Discovery Engine Open Metadata Access Service (OMAS)
The Discovery Engine OMAS provides APIs and events for metadata discovery tools that are surveying the data landscape and recording information in metadata repositories.
The Open Discovery Framework (ODF) provides a comprehensive set of open APIs that describe the interaction between metadata discovery tools and a metadata server. The aim is to make it easy for metadata discovery tools to work with open metadata repositories.
The capabilities defined in the ODF fall into four broad categories:
- The metadata server APIs - these are implemented by the Discovery Engine OMAS and include:
  - Discovery configuration API - for configuring discovery engines and services, and for retrieving this configuration.
  - Asset catalog API - for finding assets in the metadata repository.
  - Asset store API - for retrieving a specific asset’s metadata and connector.
  - Annotation store API - for storing new metadata about the asset.
- The discovery services - these are the specialist plugin services that each perform a particular type of analysis. These are implemented by the metadata discovery tool (or interface with the discovery tool’s APIs to drive specific types of analysis).
- The discovery engines - these manage the work of a collection of related discovery services.
- The discovery server - this hosts one or more discovery engines. It provides a REST API to request specific analysis on particular assets, monitor progress of the discovery services and review the results.
Figure 1 shows how these capabilities work together.
Figure 1: Interfaces of the Discovery Engine OMAS
The configuration of the discovery engines, and of the discovery services that they support, is managed in the metadata server through the Discovery Engine OMAS.
The Discovery Server is typically located close to the data assets to minimize the network traffic resulting from the analysis. Where the data assets are distributed in multiple locations, it is possible to deploy a Discovery Server in each location so the discovery workload is kept close to the data.
A single Discovery Engine OMAS can support multiple discovery servers deployed in this way.
Each discovery server is configured with the location of the metadata server where the Discovery Engine OMAS is running along with the names of the discovery engines it will host. The same discovery engine can simultaneously run on multiple discovery servers. This means the discovery server can host all of the discovery engines it needs to analyse the assets at its location.
When the discovery server starts, it calls the Discovery Engine OMAS to retrieve the configuration for each of its discovery engines (see Figure 1, number 1). It also connects to the Discovery Engine OMAS’s out topic to receive any updates on this configuration while it is running.
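The start-up sequence described above can be sketched as follows. This is a minimal, in-memory illustration and not the Egeria client API: `EngineConfig`, `FakeDiscoveryEngineOMAS`, `DiscoveryServer` and all of their methods are hypothetical names, and the out topic is modelled as a simple list of callbacks rather than a real event topic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EngineConfig:
    """Hypothetical stand-in for one discovery engine's configuration."""
    engine_name: str
    request_types: List[str] = field(default_factory=list)


class FakeDiscoveryEngineOMAS:
    """In-memory stand-in for the metadata server; not the real REST API or out topic."""

    def __init__(self, configs: Dict[str, EngineConfig]) -> None:
        self._configs = dict(configs)
        self._listeners: List[Callable[[EngineConfig], None]] = []

    def get_engine_config(self, engine_name: str) -> EngineConfig:
        return self._configs[engine_name]

    def subscribe_out_topic(self, listener: Callable[[EngineConfig], None]) -> None:
        self._listeners.append(listener)

    def update_engine_config(self, config: EngineConfig) -> None:
        # Store the change, then publish it to every subscriber.
        self._configs[config.engine_name] = config
        for listener in self._listeners:
            listener(config)


class DiscoveryServer:
    """Hosts a named set of discovery engines (assumed shape, for illustration only)."""

    def __init__(self, omas: FakeDiscoveryEngineOMAS, engine_names: List[str]) -> None:
        self._omas = omas
        self._engine_names = list(engine_names)
        self.engines: Dict[str, EngineConfig] = {}

    def start(self) -> None:
        # Figure 1, number 1: retrieve the configuration of each hosted engine.
        for name in self._engine_names:
            self.engines[name] = self._omas.get_engine_config(name)
        # Then listen for configuration changes while the server is running.
        self._omas.subscribe_out_topic(self._on_config_update)

    def _on_config_update(self, config: EngineConfig) -> None:
        # Ignore updates for engines that this server does not host.
        if config.engine_name in self.engines:
            self.engines[config.engine_name] = config


omas = FakeDiscoveryEngineOMAS({
    "engine-a": EngineConfig("engine-a", ["profile"]),
    "engine-b": EngineConfig("engine-b", ["classify"]),
})
server = DiscoveryServer(omas, ["engine-a"])
server.start()
omas.update_engine_config(EngineConfig("engine-a", ["profile", "deduplicate"]))
omas.update_engine_config(EngineConfig("engine-b", ["classify", "mask"]))
print(server.engines["engine-a"].request_types)
```

In this sketch the server picks up the change to `engine-a` without restarting, and the update to `engine-b` is ignored because that engine is not hosted on this server.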
A discovery engine’s configuration lists the discovery request types that it supports. Each request type is linked to the discovery service that runs when that type of discovery is requested for a specific asset. This is shown in Figure 2.
Figure 2: Discovery Engine Configuration
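One way to picture this configuration is as a lookup table from discovery request type to registered discovery service. The sketch below is an assumption about the shape of the data, not the stored metadata model: `RegisteredService`, `DiscoveryEngineConfig`, and the request type and service names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class RegisteredService:
    """Hypothetical description of a discovery service registered with an engine."""
    service_name: str
    description: str


@dataclass
class DiscoveryEngineConfig:
    """Hypothetical engine configuration: request type -> discovery service."""
    engine_name: str
    services_by_request_type: Dict[str, RegisteredService] = field(default_factory=dict)

    def register(self, request_type: str, service: RegisteredService) -> None:
        self.services_by_request_type[request_type] = service

    def service_for(self, request_type: str) -> RegisteredService:
        # Resolve the service to run for a request type, or fail clearly.
        try:
            return self.services_by_request_type[request_type]
        except KeyError:
            raise ValueError(
                f"{self.engine_name} does not support request type {request_type!r}"
            ) from None


profiler = RegisteredService("column-profiler", "Profiles the values in each column")
classifier = RegisteredService("data-classifier", "Assigns data classes to columns")

config = DiscoveryEngineConfig("engine-a")
config.register("profile", profiler)
config.register("classify", classifier)

print(config.service_for("profile").service_name)
```

A request for a type that the engine has not registered raises a `ValueError` in this sketch; how the real discovery engine reports an unsupported request type is not covered here.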
When a discovery request is made, the discovery engine creates an instance of the discovery service and gives it access to a discovery context. The discovery context provides access to the existing metadata known about the asset, a connector to access the data stored in the asset, and a store to record the new metadata that the service discovers about the asset. Behind the scenes, the discovery context calls the Discovery Engine OMAS both to retrieve metadata about the asset and its connector (see Figure 1, number 2) and to store the new metadata (see Figure 1, number 3).
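The request flow can be sketched as below. This is an illustrative model only, assuming an in-memory metadata server: `InMemoryMetadataServer`, `DiscoveryContext`, `RowCountService`, `DiscoveryEngine`, `Annotation` and their methods are hypothetical names and do not reproduce the ODF interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Annotation:
    """Hypothetical record of new metadata discovered about an asset."""
    annotation_type: str
    summary: str


class InMemoryMetadataServer:
    """Stand-in for the calls that would go to the Discovery Engine OMAS."""

    def __init__(self, assets: Dict[str, dict], data: Dict[str, List[dict]]) -> None:
        self._assets = assets
        self._data = data
        self.annotations: Dict[str, List[Annotation]] = {}

    def get_asset(self, asset_id: str) -> dict:
        return self._assets[asset_id]

    def get_connector(self, asset_id: str) -> Callable[[], List[dict]]:
        # The "connector" here is just a function that returns the asset's rows.
        return lambda: self._data[asset_id]

    def add_annotation(self, asset_id: str, annotation: Annotation) -> None:
        self.annotations.setdefault(asset_id, []).append(annotation)


class DiscoveryContext:
    """Gives one discovery service access to one asset."""

    def __init__(self, server: InMemoryMetadataServer, asset_id: str) -> None:
        self._server = server
        self._asset_id = asset_id

    def asset_properties(self) -> dict:
        return self._server.get_asset(self._asset_id)        # Figure 1, number 2

    def read_data(self) -> List[dict]:
        return self._server.get_connector(self._asset_id)()  # Figure 1, number 2

    def record(self, annotation: Annotation) -> None:
        self._server.add_annotation(self._asset_id, annotation)  # Figure 1, number 3


class RowCountService:
    """Example discovery service: counts the rows in the asset."""

    def run(self, context: DiscoveryContext) -> None:
        name = context.asset_properties()["name"]
        rows = context.read_data()
        context.record(Annotation("row-count", f"{name} has {len(rows)} rows"))


class DiscoveryEngine:
    def __init__(self, server: InMemoryMetadataServer, service_classes: Dict[str, type]) -> None:
        self._server = server
        self._service_classes = service_classes

    def discover(self, request_type: str, asset_id: str) -> None:
        # A new service instance is created for each discovery request.
        service = self._service_classes[request_type]()
        service.run(DiscoveryContext(self._server, asset_id))


metadata_server = InMemoryMetadataServer(
    assets={"asset-1": {"name": "customers"}},
    data={"asset-1": [{"id": 1}, {"id": 2}, {"id": 3}]},
)
engine = DiscoveryEngine(metadata_server, {"count-rows": RowCountService})
engine.discover("count-rows", "asset-1")
print(metadata_server.annotations["asset-1"][0].summary)  # prints "customers has 3 rows"
```

The discovery service in this sketch never talks to the metadata server directly. Everything goes through the context, which is what allows the same service to run on any discovery server.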
The Open Discovery Framework (ODF) provides more information about the discovery engines and discovery services along with the metadata APIs.
In Egeria, both the metadata server where the Discovery Engine OMAS runs and the discovery server are types of OMAG Servers. More information on the operation of the discovery server can be found under the Discovery Engine Services. These services belong to the specialist subsystem of the discovery server.
The module structure for the Discovery Engine OMAS is as follows:
- discovery-engine-client supports the client library that is used by the discovery server (and the discovery engines and discovery services it hosts) to access the Discovery Engine OMAS’s REST API and out topic.
- discovery-engine-api supports the common Java classes that are used both by the client and the server. Since the Open Discovery Framework (ODF) defines most of the interfaces for the Discovery Engine OMAS, this module only needs to provide the interfaces associated with the out topic.
- discovery-engine-connectors supports the connector implementations for the out topic - both client side and server side.
- discovery-engine-server supports an implementation of the metadata interfaces defined by the Open Discovery Framework (ODF) and its related event management.
- discovery-engine-spring supports the REST API using the Spring libraries. This module has no business logic associated with it. Each REST API endpoint delegates immediately to an equivalent function in the server module. It is, however, a useful place to look to get a view of the REST API supported by this OMAS.
License: CC BY 4.0, Copyright Contributors to the ODPi Egeria project.