Creating a connector to access a third party technology

When you want to connect to a tool or system from an existing service, you need to create an open connector for that tool or system. For example, you might want to connect a new metadata repository into Egeria, or connect Egeria with a new data processing engine.

To write an open connector you need to complete four steps:

  1. Identify the properties for the connection
  2. Write the connector provider
  3. Understand the interface to which the connector needs to adhere
  4. Write the connector itself

All of the code you write to implement these should exist in its own module and, as illustrated by the examples, could even be in its own independent code repository. Their implementation will have dependencies on Egeria’s

However, Egeria’s OMAG Server Platform has no dependency on these specific connector implementations, and they could run in another runtime that supports the connector APIs.

1. Identify the properties for the connection

Begin by identifying and designing the properties needed to connect to your tool or system. These will commonly include a network address, protocol, and user credentials, but could also include other information.
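As a rough illustration of this step, the properties for a hypothetical tool could be collected in one place before any connector code is written. The class ExampleToolSettings and its field names are invented for this sketch and are not part of Egeria; the point is only to make the required and optional properties explicit.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the properties identified in this step; not an Egeria class.
final class ExampleToolSettings {
    final String address;                  // network address, including the protocol
    final String userId;                   // user credentials
    final String clearPassword;
    final Map<String, Object> additional;  // any other information the tool needs

    ExampleToolSettings(String address, String userId, String clearPassword,
                        Map<String, Object> additional) {
        if (address == null || address.trim().isEmpty()) {
            throw new IllegalArgumentException("A network address is required");
        }
        this.address = address.trim();
        this.userId = userId;
        this.clearPassword = clearPassword;
        // Defensive copy so that later changes by the caller do not leak in.
        this.additional = (additional == null)
                ? Collections.<String, Object>emptyMap()
                : Collections.unmodifiableMap(new LinkedHashMap<String, Object>(additional));
    }

    // Returns the protocol part of the address, or an empty string if none is given.
    String protocol() {
        int separator = address.indexOf("://");
        return (separator < 0) ? "" : address.substring(0, separator);
    }
}
```

Writing the properties down like this tends to show early which values are mandatory (here, only the address) and which can be left to optional configuration.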

2. Write the connector provider

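The connector provider is a simple Java factory that implements the creation of the connector type it can instantiate. It describes that connector type, for example through its name and description, and identifies the connector class to instantiate.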

For example, the DataStageConnectorProvider is used to instantiate connectors to IBM DataStage data processing engines. Therefore its name and description refer to DataStage, and the connectors it instantiates are DataStageConnectors.

Similarly, the IGCOMRSRepositoryConnectorProvider is used to instantiate connectors to IBM Information Governance Catalog (IGC) metadata repositories. In contrast to the DataStageConnectorProvider, the IGCOMRSRepositoryConnectorProvider’s name and description refer to IGC, and the connectors it instantiates are IGCOMRSRepositoryConnectors.

Note that the code of all of these connector implementations exists outside Egeria itself (in separate code repositories), and there are no dependencies within Egeria on these external repositories.

All connectors can be configured with the network address and credential information needed to access the underlying tool or system. Therefore, we do not need to explicitly list properties for such basic details. However, the names of any additional configuration properties that may be useful to a specific type of connector can be described through the recognizedConfigurationProperties of the connector type.

The basic implementation pattern

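From the two examples (DataStageConnectorProvider and IGCOMRSRepositoryConnectorProvider), you will see that writing a connector provider follows a simple pattern: extend the connector provider base class for your type of connector, then set the name, description, connector class, and any recognized configuration properties of the connector type.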
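A minimal, self-contained sketch of a provider written in this style is shown here. The classes SketchConnectorType and SketchConnectorProviderBase are hypothetical simplifications written for this example so that it compiles on its own; a real provider would instead extend the base class that Egeria supplies for its connector category (for example DataEngineConnectorProviderBase). The GUID, class name, and the pageSize property are placeholder values.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a connector type description; not the Egeria class.
class SketchConnectorType {
    String guid;
    String displayName;
    String description;
    String connectorProviderClassName;
    List<String> recognizedConfigurationProperties = Collections.emptyList();
}

// Hypothetical stand-in for a category-specific connector provider base class.
abstract class SketchConnectorProviderBase {
    private String connectorClassName;
    protected SketchConnectorType connectorTypeBean;

    protected void setConnectorClassName(String connectorClassName) {
        this.connectorClassName = connectorClassName;
    }

    String getConnectorClassName() { return connectorClassName; }

    SketchConnectorType getConnectorType() { return connectorTypeBean; }
}

// The provider itself: a few constants plus a no-argument constructor.
class ExampleToolConnectorProvider extends SketchConnectorProviderBase {
    static final String CONNECTOR_TYPE_GUID = "00000000-0000-0000-0000-000000000000"; // placeholder
    static final String CONNECTOR_TYPE_NAME = "Example Tool Connector";
    static final String CONNECTOR_TYPE_DESCRIPTION = "Connector to a hypothetical Example Tool.";
    static final String CONNECTOR_CLASS_NAME = "com.example.ExampleToolConnector";    // hypothetical
    static final String PAGE_SIZE_PROPERTY = "pageSize";                              // hypothetical

    ExampleToolConnectorProvider() {
        // Tell the base class which connector class this provider instantiates.
        setConnectorClassName(CONNECTOR_CLASS_NAME);

        // Describe the connector type that this provider supports.
        SketchConnectorType connectorType = new SketchConnectorType();
        connectorType.guid = CONNECTOR_TYPE_GUID;
        connectorType.displayName = CONNECTOR_TYPE_NAME;
        connectorType.description = CONNECTOR_TYPE_DESCRIPTION;
        connectorType.connectorProviderClassName = getClass().getName();
        connectorType.recognizedConfigurationProperties =
                Collections.singletonList(PAGE_SIZE_PROPERTY);
        connectorTypeBean = connectorType;
    }
}
```

Keeping the names as constants means the same values can be reused in documentation and tests without being retyped.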

3. Understand the interface to which the connector needs to adhere

Now that we have the connector provider to instantiate our connectors, we need to understand what our connectors actually need to do. For a service to use our connector, the connector must provide a set of methods that are relevant to that service.

For example, the Data Engine Proxy Services integrate metadata from data engines with Egeria. To integrate DataStage with Egeria, we want our DataStageConnector to be used by the data engine proxy services. Therefore the connector needs to extend DataEngineConnectorBase, because this defines the methods needed by the data engine proxy services.

Likewise, we want our IGCOMRSRepositoryConnector to integrate IGC with Egeria as a metadata repository. Therefore the connector needs to extend OMRSRepositoryConnector, because this defines the methods needed to integrate with Open Metadata Repository Services (OMRS).

How did we know to extend these base classes? The connector provider implementations in the previous step each extended a base class specific to the type of connector they provide (DataEngineConnectorProviderBase and OMRSRepositoryConnectorProviderBase). These connector base classes (DataEngineConnectorBase and OMRSRepositoryConnector) are in the same package structure as those connector provider base classes.

In both cases, by extending the abstract classes (DataEngineConnectorBase and OMRSRepositoryConnector), our connector must implement the methods these abstract classes define. Our services (Data Engine Proxy Services and OMRS) are written against these general methods, so they do not need to know anything about the underlying technology. Therefore, we can simply “plug in” the underlying technology: any technology with a connector that implements these methods can run our service. Furthermore, each technology-specific connector can decide how best to implement those methods for itself.
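The plug-in idea can be illustrated with a small, self-contained sketch. All of the names here (SketchEngineConnectorBase, SketchProxyService, ExampleToolEngineConnector and the method getChangedProcessNames) are hypothetical and much simpler than the real base classes; they only show a service written against an abstract class while the technology-specific subclass supplies the behaviour.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a connector base class that a service depends on.
abstract class SketchEngineConnectorBase {
    // Each technology-specific connector decides how to answer this for itself.
    abstract List<String> getChangedProcessNames();
}

// A service written only against the abstract class; it knows nothing about the tool.
class SketchProxyService {
    private final SketchEngineConnectorBase connector;

    SketchProxyService(SketchEngineConnectorBase connector) {
        this.connector = connector;
    }

    // Returns how many changed processes the plugged-in connector reported.
    int synchronize() {
        return connector.getChangedProcessNames().size();
    }
}

// One possible technology-specific connector; a real one would call the tool's API here.
class ExampleToolEngineConnector extends SketchEngineConnectorBase {
    @Override
    List<String> getChangedProcessNames() {
        return Arrays.asList("jobA", "jobB");
    }
}
```

Swapping in a connector for a different technology would require no change to SketchProxyService, which is the property the base classes are designed to provide.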

4. Write the connector itself

This brings us to writing the connector itself. Now that we understand the interface our connector must provide, we need to implement the methods defined by that interface.

Implement the connector by:

  1. Retrieving connection information provided by the configuration. The default implementation of initialize() saves the connection object used to create the connector. If your connector needs to override the initialize() method, it should call super.initialize() to capture the connection properties for the base classes.
  2. Running the main logic for your connector in the start() method. Use the configuration details from the connection object to connect to your underlying technology. If the connector is long running, this may be the time to start up a separate thread. However, this has to conform to the rules laid down for the category of connector you are implementing.
  3. Using pre-existing, technology-specific clients and APIs to talk to your underlying technology.
  4. Translating the underlying technology’s representation of information into the Open Metadata representation used by the connector interface itself.
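The four steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the Egeria API: SketchConnection, SketchConnectorBase, and ExampleToolClient are hypothetical stand-ins, the method signatures are simplified, and the output format is invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the connection object handed to a connector.
class SketchConnection {
    final String address;
    SketchConnection(String address) { this.address = address; }
}

// Hypothetical stand-in for a connector base class; signatures are simplified.
abstract class SketchConnectorBase {
    protected String connectorInstanceId;
    protected SketchConnection connection;

    // Default behaviour: save the connection object used to create the connector.
    void initialize(String connectorInstanceId, SketchConnection connection) {
        this.connectorInstanceId = connectorInstanceId;
        this.connection = connection;
    }

    void start() { }
}

// Hypothetical technology-specific client that returns the tool's native job names.
class ExampleToolClient {
    ExampleToolClient(String baseUrl) { /* a real client would open a session here */ }
    List<String> listJobs() { return Arrays.asList("JOB_A", "JOB_B"); }
}

class ExampleToolConnector extends SketchConnectorBase {
    private ExampleToolClient client;

    @Override
    void initialize(String connectorInstanceId, SketchConnection connection) {
        // Step 1: let the base class capture the connection before doing anything else.
        super.initialize(connectorInstanceId, connection);
    }

    @Override
    void start() {
        // Step 2: use the saved connection details to reach the underlying technology.
        if (connection == null || connection.address == null) {
            throw new IllegalStateException("No address supplied in the connection");
        }
        // Step 3: rely on the tool's own client rather than re-implementing its protocol.
        client = new ExampleToolClient(connection.address);
    }

    // Step 4: translate the tool's representation into the one the interface expects.
    List<String> getProcessQualifiedNames() {
        List<String> qualifiedNames = new ArrayList<>();
        for (String job : client.listJobs()) {
            qualifiedNames.add(connection.address + "::" + job);
        }
        return qualifiedNames;
    }
}
```

Calling initialize() and then start() before any other method mirrors the order described in the list; calling getProcessQualifiedNames() earlier would fail in this sketch because the client has not yet been created.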

For the first point, you can retrieve general connection information, such as the endpoint address and user credentials, from the connection object.

Use these details to connect to and authenticate against your underlying technology, even when it is running on a different system from the connector itself. Check for null objects (such as the EndpointProperties) before operating on them.
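A defensive way of reading these details might look like the sketch below. The classes SketchEndpoint and SketchConnectionProperties are hypothetical stand-ins defined here so that the example compiles on its own; their getter names are assumptions chosen for illustration and should be checked against the actual connection classes your connector receives.

```java
// Hypothetical stand-in for the endpoint part of a connection.
class SketchEndpoint {
    private final String address;
    SketchEndpoint(String address) { this.address = address; }
    String getAddress() { return address; }
}

// Hypothetical stand-in for the connection properties given to a connector.
class SketchConnectionProperties {
    private final SketchEndpoint endpoint;
    private final String userId;
    private final String clearPassword;

    SketchConnectionProperties(SketchEndpoint endpoint, String userId, String clearPassword) {
        this.endpoint = endpoint;
        this.userId = userId;
        this.clearPassword = clearPassword;
    }

    SketchEndpoint getEndpoint() { return endpoint; }
    String getUserId() { return userId; }
    String getClearPassword() { return clearPassword; }
}

class ConnectionDetailsReader {
    // Returns "userId@address", or throws if a required detail is missing.
    static String describeTarget(SketchConnectionProperties properties) {
        if (properties == null) {
            throw new IllegalArgumentException("No connection properties supplied");
        }
        // The endpoint may legitimately be absent, so check it before using it.
        SketchEndpoint endpoint = properties.getEndpoint();
        if (endpoint == null || endpoint.getAddress() == null) {
            throw new IllegalArgumentException("No endpoint address supplied");
        }
        String userId = properties.getUserId();
        if (userId == null) {
            throw new IllegalArgumentException("No user id supplied");
        }
        return userId + "@" + endpoint.getAddress();
    }
}
```

Failing early with a clear message is one design choice; a connector could equally fall back to defaults or log a warning, depending on what its category of connector permits.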

Retrieve additional properties from the configuration properties held in the connection object.
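One way to read such a property safely is sketched here, assuming the additional properties arrive as a map from property name to value. The helper class ConfigurationPropertyReader and the pageSize property are hypothetical and exist only for this example.

```java
import java.util.Map;

class ConfigurationPropertyReader {
    // Reads an integer property, falling back to a default when it is absent or unparseable.
    static int getInt(Map<String, Object> configurationProperties, String name, int defaultValue) {
        if (configurationProperties == null) {
            return defaultValue;
        }
        Object value = configurationProperties.get(name);
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value instanceof String) {
            try {
                return Integer.parseInt(((String) value).trim());
            } catch (NumberFormatException e) {
                return defaultValue;
            }
        }
        return defaultValue;
    }
}
```

Accepting both numeric and string values is a deliberate choice in this sketch, because configuration supplied as JSON or text may deliver either form.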

Implementation of the remaining points (2-4) will vary widely depending on the specific technology being used.



License: CC BY 4.0, Copyright Contributors to the ODPi Egeria project.