Architecture

The Oceanum Datamesh acts as a single access point for environmental data. This includes your own data, which may be stored on your own data services or on the Datamesh itself, and data from numerous agencies that provide environmental data.

The Datamesh has three main components that work together to provide a single unified service for finding and accessing data:

  1. Registry

  2. Storage

  3. Gateway

Datamesh registry

All datasources on the Datamesh are entered in a registry, which stores details about the data including:

  • A unique ID

  • Connection details (external data service details or internal storage location)

  • Its geometry (where it is located spatially)

  • Temporal extent

  • Variables or properties

  • Human readable name and description

  • Keyword tags

  • External links to supporting information

  • Arbitrary additional metadata in key/value pairs

The metadata in the registry allows datasources to be searched by different dimensions and filters. Metadata is separate from the data in the datasource itself and can be changed without touching the underlying data.

NOTE: The connection details, which could include passwords or keys, are not exposed to end-users of the data. This allows the data itself to be shared without revealing sensitive credentials.
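As an illustration of the kind of record described above, here is a minimal sketch of a registry entry in Python. The class name, field names and the `search` helper are all hypothetical and chosen for this example; they are not the actual Datamesh schema or API. The `public_view` method shows one way connection details could be kept out of what end-users see.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class RegistryEntry:
    """Hypothetical registry record; field names are illustrative only."""

    id: str
    name: str
    description: str
    connection: dict[str, Any]  # may hold passwords or keys
    geometry: Optional[dict[str, Any]] = None  # GeoJSON-style geometry
    time_start: Optional[str] = None  # ISO 8601 timestamps
    time_end: Optional[str] = None
    variables: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    links: list[str] = field(default_factory=list)
    extra: dict[str, Any] = field(default_factory=dict)  # arbitrary key/value pairs

    def public_view(self) -> dict[str, Any]:
        """Return the metadata with the connection details removed."""
        record = asdict(self)
        record.pop("connection")
        return record


def search(entries, tag=None, variable=None):
    """Filter entries by keyword tag and/or variable name."""
    return [
        e
        for e in entries
        if (tag is None or tag in e.tags)
        and (variable is None or variable in e.variables)
    ]


# Made-up example record, not a real datasource.
entry = RegistryEntry(
    id="example-wave-buoy",
    name="Example wave buoy",
    description="Illustrative record only",
    connection={"url": "https://example.invalid/data", "password": "secret"},
    geometry={"type": "Point", "coordinates": [174.0, -39.0]},
    variables=["hs", "tp"],
    tags=["waves", "buoy"],
)
```

Because the metadata lives in the record rather than in the data, a field such as `tags` can be edited without reading or rewriting the underlying datasource.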

Datamesh storage

The Datamesh has a storage system that allows any suitable user data to be uploaded and stored in the Datamesh cloud. When you upload data to the storage system, much of the metadata needed for registration is extracted from the data itself.

Datamesh gateway

The Datamesh gateway is the access point for all data on the Datamesh. All data is accessed through the Datamesh APIs. The gateway takes a user request for data, authenticates and authorises it, executes the query, and delivers the data in the specified format.
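The order of those stages can be sketched in a few lines of Python. This is a toy model only: the in-memory dictionaries, the `handle_request` function and the token and datasource names are assumptions made for the example and do not reflect how the gateway is implemented.

```python
from typing import Any, Callable

# Hypothetical in-memory stand-ins for the token store, permissions and data.
TOKENS = {"token-abc": "org-a"}  # token -> organisation
PERMISSIONS = {"demo-source": {"org-a"}}  # datasource -> organisations with access
DATA = {"demo-source": [{"time": "2024-01-01", "hs": 1.2}]}
ENCODERS: dict[str, Callable[[list], Any]] = {
    "json": lambda rows: rows,
    "csv": lambda rows: "\n".join(
        ",".join(str(v) for v in row.values()) for row in rows
    ),
}


def handle_request(token: str, datasource: str, fmt: str = "json") -> Any:
    # 1. Authenticate: work out who is making the request.
    org = TOKENS.get(token)
    if org is None:
        raise PermissionError("unknown token")
    # 2. Authorise: check that the organisation may read this datasource.
    if org not in PERMISSIONS.get(datasource, set()):
        raise PermissionError("organisation has no access to this datasource")
    # 3. Execute the query (here simply a lookup).
    rows = DATA[datasource]
    # 4. Deliver the result in the requested format.
    return ENCODERS[fmt](rows)
```

Authentication and authorisation run before any data is touched, so a rejected request never reaches the query step.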

Query engine

The gateway query engine executes the user request and carries out a number of functions:

  • Spatial subsetting, interpolation and downsampling

  • Temporal subsetting and/or downsampling

  • Variable selection

  • Aggregation operations
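To make these operations concrete, the sketch below applies a bounding-box subset, a time-range subset, variable selection and a mean aggregation to a small list of made-up records. The `run_query` function and its parameters are hypothetical, and interpolation and downsampling are left out to keep the example short.

```python
from datetime import datetime
from statistics import mean

# Made-up sample records; "hs" and "tp" are illustrative variable names.
ROWS = [
    {"time": datetime(2024, 1, 1), "lon": 174.0, "lat": -39.0, "hs": 1.0, "tp": 8.0},
    {"time": datetime(2024, 1, 2), "lon": 174.5, "lat": -39.5, "hs": 2.0, "tp": 9.0},
    {"time": datetime(2024, 1, 3), "lon": 180.0, "lat": -45.0, "hs": 3.0, "tp": 10.0},
]
COORDS = ("time", "lon", "lat")


def run_query(rows, bbox=None, time_range=None, variables=None, aggregate=None):
    # Spatial subsetting: keep points inside (west, south, east, north).
    if bbox is not None:
        west, south, east, north = bbox
        rows = [
            r for r in rows
            if west <= r["lon"] <= east and south <= r["lat"] <= north
        ]
    # Temporal subsetting: keep records inside the inclusive time range.
    if time_range is not None:
        start, end = time_range
        rows = [r for r in rows if start <= r["time"] <= end]
    # Variable selection: keep coordinates plus the requested variables.
    if variables is not None:
        keep = {*COORDS, *variables}
        rows = [{k: v for k, v in r.items() if k in keep} for r in rows]
    # Aggregation: reduce each remaining variable to its mean.
    if aggregate == "mean":
        if not rows:
            return {}
        names = variables or [k for k in rows[0] if k not in COORDS]
        return {name: mean(r[name] for r in rows) for name in names}
    return rows


result = run_query(
    ROWS, bbox=(173.0, -40.0, 175.0, -38.0), variables=["hs"], aggregate="mean"
)
```

Subsetting happens before aggregation, so the mean is computed only over the records that survive the spatial and temporal filters.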

Format conversion

The gateway supports most community standard formats for the different types of datasource supported by the Datamesh. These include:

  • (Geo)JSON

  • NetCDF

  • GeoTIFF

  • CSV

  • Excel

  • Parquet

  • Arrow

  • ESRI Shapefiles

NOTE: The format conversions available for each request depend on the underlying datasource structure and the size of the data request.
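As a rough illustration of format conversion, the following sketch turns the same point records into CSV and GeoJSON using only the Python standard library. The `convert` function and the `CONVERTERS` table are hypothetical names for this example; the error for an unsupported format mirrors the idea that not every format is available for every request.

```python
import csv
import io
import json

# Made-up point records used as the query result.
POINTS = [
    {"lon": 174.0, "lat": -39.0, "hs": 1.2},
    {"lon": 174.5, "lat": -39.5, "hs": 1.8},
]


def to_csv(rows):
    """Serialise the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_geojson(rows):
    """Serialise the records as a GeoJSON FeatureCollection of points."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [r["lon"], r["lat"]]},
            "properties": {k: v for k, v in r.items() if k not in ("lon", "lat")},
        }
        for r in rows
    ]
    return json.dumps({"type": "FeatureCollection", "features": features})


# Formats this toy converter knows about.
CONVERTERS = {"csv": to_csv, "geojson": to_geojson}


def convert(rows, fmt):
    if fmt not in CONVERTERS:
        raise ValueError(f"format {fmt!r} is not available for this request")
    return CONVERTERS[fmt](rows)
```

For example, `convert(POINTS, "geojson")` returns a JSON string, while `convert(POINTS, "netcdf")` raises `ValueError` because this sketch has no converter for it.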

Authorisation

Every request to the Datamesh must be authenticated. If you connect from an Oceanum.io service, such as the Datamesh UI, this is handled automatically using your Oceanum.io account. When connecting directly to the Datamesh APIs, you will need to provide your Datamesh token.

Access to datasources is granted on a per-organisation basis. If you have sharing permissions on a datasource, you are free to share that datasource with other organisations, or with all users.
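The per-organisation sharing model can be pictured with a small sketch. The `AccessControl` class, its methods and the `"*"` marker for "all users" are assumptions invented for this example; they are not the Datamesh permission API.

```python
PUBLIC = "*"  # hypothetical marker meaning "all users"


class AccessControl:
    """Toy model of per-organisation read and share permissions."""

    def __init__(self):
        self._readers = {}  # datasource id -> organisations that may read
        self._sharers = {}  # datasource id -> organisations that may share

    def register(self, datasource, owner_org):
        # The owning organisation can both read and share its datasource.
        self._readers[datasource] = {owner_org}
        self._sharers[datasource] = {owner_org}

    def can_read(self, datasource, org):
        readers = self._readers.get(datasource, set())
        return org in readers or PUBLIC in readers

    def share(self, datasource, from_org, to_org=PUBLIC):
        # Only organisations with sharing permission may extend access.
        if from_org not in self._sharers.get(datasource, set()):
            raise PermissionError(f"{from_org} may not share {datasource}")
        self._readers[datasource].add(to_org)


acl = AccessControl()
acl.register("demo-source", owner_org="org-a")
acl.share("demo-source", from_org="org-a", to_org="org-b")
```

In this sketch, receiving read access does not confer sharing permission, so `org-b` can read `demo-source` but cannot pass it on.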

Caching

The gateway has a high-performance caching layer that optimises internal access to data, improving access speed and reducing load on external data services.
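The effect of such a layer can be shown with a simple time-to-live cache placed in front of a slow fetch function. This is a generic caching sketch, not a description of the gateway's cache: the `QueryCache` class, the five-minute default and the `slow_external_fetch` stand-in are all assumptions.

```python
import time


class QueryCache:
    """Cache results per (datasource, parameters) for a fixed time to live."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (time stored, value)

    def get(self, datasource, **params):
        # Parameter values must be hashable so they can form part of the key.
        key = (datasource, tuple(sorted(params.items())))
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh entry: skip the external service
        value = self._fetch(datasource, **params)
        self._store[key] = (now, value)
        return value


calls = []  # records each time the external service is actually contacted


def slow_external_fetch(datasource, **params):
    """Stand-in for a request to an external data service."""
    calls.append((datasource, params))
    return {"datasource": datasource, "params": params}


cache = QueryCache(slow_external_fetch)
first = cache.get("demo-source", variable="hs")
second = cache.get("demo-source", variable="hs")
```

The second identical request is answered from the cache, so `calls` contains a single entry and the external service is contacted only once.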

Connecting to external data services

The Datamesh is able to connect to many existing third-party data services, including:

  • SQL servers

  • OGC Web Feature Service (WFS)

  • OPeNDAP

  • Zarr archives

  • ESRI Image and Feature services

The internal architecture of the Datamesh is designed to make it easy to connect most services that expose structured data. This can include proprietary specifications and protocols. Contact us to enquire about connecting your data service to the Datamesh.
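One common way to make new services easy to plug in is a small connector interface that every service type implements. The sketch below is an assumption about what such an interface could look like, not the Datamesh internals: `ExternalConnector` and `SqlTableConnector` are hypothetical names, and an in-memory SQLite database stands in for an external SQL server.

```python
import sqlite3
from abc import ABC, abstractmethod


class ExternalConnector(ABC):
    """Hypothetical interface each external data service would implement."""

    @abstractmethod
    def describe(self) -> dict:
        """Return metadata that could be used to populate a registry entry."""

    @abstractmethod
    def read(self, variables: list) -> list:
        """Return records containing only the requested variables."""


class SqlTableConnector(ExternalConnector):
    """Connector for a single table reachable through a DB-API connection."""

    def __init__(self, connection: sqlite3.Connection, table: str):
        if not table.isidentifier():
            raise ValueError("table name must be a plain identifier")
        self._conn = connection
        self._table = table

    def describe(self) -> dict:
        # A zero-row query is enough to discover the column names.
        cursor = self._conn.execute(f"SELECT * FROM {self._table} LIMIT 0")
        return {"variables": [column[0] for column in cursor.description]}

    def read(self, variables: list) -> list:
        # Only allow column names that the table actually has.
        known = set(self.describe()["variables"])
        unknown = [v for v in variables if v not in known]
        if unknown:
            raise KeyError(f"unknown variables: {unknown}")
        query = f"SELECT {', '.join(variables)} FROM {self._table}"
        return [dict(zip(variables, row)) for row in self._conn.execute(query)]


# In-memory database with made-up observations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (time TEXT, hs REAL, tp REAL)")
conn.execute("INSERT INTO obs VALUES ('2024-01-01', 1.2, 8.0)")
connector = SqlTableConnector(conn, "obs")
```

With this shape, supporting another service type means writing one more subclass; the code that calls `describe` and `read` does not need to change.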