Semantic Data Cube

The Semantic Data Cube enables the semantic enrichment of the Earth System Data Cube (ESDC). The Semantic Data Cube allows users to query EO data, other Linked Open Data, and information/knowledge extracted from the data, using a semantic query language, thus creating new value chains.

The core of the Semantic Data Cube is Ontop-spatial. Ontop-spatial creates virtual geospatial Resource Description Framework graphs – commonly known as RDF graphs – on top of geospatial data models, such as the ones supported by ESDC. The geometries are then mapped to GeoSPARQL geometry literals using ontologies and R2RML/OBDA mappings. Ontop-spatial can be used as a standard SPARQL endpoint that executes GeoSPARQL queries on top of ESDC. It can therefore be used alongside other tools that produce, manage, explore, and visualize geospatial RDF data.
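
As a rough illustration of what querying such an endpoint could look like, the sketch below builds (but does not send) a SPARQL Protocol POST request carrying a GeoSPARQL query. The endpoint URL, the ex: vocabulary, and the bounding polygon are assumptions made up for this example; they are not taken from the DeepCube deployment.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; replace with the address of a real deployment.
ENDPOINT = "http://localhost:8080/sparql"

# GeoSPARQL query: observations whose geometry lies within a polygon.
# The ex: namespace and its properties are invented for illustration.
QUERY = """
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX ex:   <http://example.org/cube#>

SELECT ?obs ?value WHERE {
  ?obs ex:hasValue ?value ;
       geo:hasGeometry/geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt,
    "POLYGON((23 37, 25 37, 25 39, 23 39, 23 37))"^^geo:wktLiteral))
}
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol request (query sent as a URL-encoded POST body)."""
    body = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    return Request(endpoint, data=body, headers=headers)


request = build_request(ENDPOINT, QUERY)
# To execute against a live endpoint:
#   import json, urllib.request
#   results = json.load(urllib.request.urlopen(request))
```

Because the request follows the standard SPARQL Protocol, the same code should work against any compliant endpoint, which is what makes it possible to combine the Semantic Data Cube with other RDF tools.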

The new semantic data cube technology we have developed allows us to express the following classes of queries:

  1. Queries on the Earth Observation data.
  2. Semantic queries on the low-level content of the image.
  3. Semantic queries on the high-level content of the image.
  4. Any of the above query classes together with a spatial and temporal extent.
  5. Any of the above query classes together with a reference to an external data source.

In current data cubes, only queries of Classes 1 and 2 are possible. In the Semantic Data Cubes implemented in DeepCube, all of the above classes of queries will be possible. This is a significant research output of the project and supports the implementation of three Use Cases: climate-induced migration in Africa, fire hazard forecasting in the Mediterranean, and Copernicus services for sustainable tourism.


The architectural overview of the system is shown in the figure below. Ontop-spatial, developed by UoA, is the first ontology-based data access (OBDA) system with GeoSPARQL support; it is a fork of an older version of Ontop (1.18). Since then, Ontop has been substantially improved and continues to undergo heavy development and maintenance. The most recent official version (version 4) also supports GeoSPARQL. We decided to use Ontop v4 in DeepCube, as the original developers were eager to provide assistance when needed.

Architecture of Semantic Data Cube

We consider that a data cube essentially consists of screened data, or Analysis Ready Data (ARD), with the dimensions “latitude”, “longitude”, “time”, and “variable”. Further dimensions can be added as a result of an analysis. In DeepCube, the semantic data cube technology will be applied to the following Use Cases:

  • UC1 – Forecasting extreme drought and heat impacts in Africa
  • UC3 – Fire hazard short term forecasting in the Mediterranean
  • UC5 – Copernicus services for sustainable tourism

Foreign Data Wrappers

In 2003, a new specification called SQL/MED (“SQL Management of External Data”) was added to the SQL standard. It is a standardized way of handling access to remote objects from SQL databases. In 2011, PostgreSQL 9.1 was released with read-only support for this standard, and in 2013 write support was added with PostgreSQL 9.3. There are now a variety of Foreign Data Wrappers (FDWs) available that enable a PostgreSQL server to access different remote data stores, ranging from other SQL databases to flat files. The one used in the project’s pipeline needs to handle local or remote data cubes. We achieve this through the use of Multicorn, a PostgreSQL 9.1+ extension that makes Foreign Data Wrapper development easy by allowing the programmer to write wrappers in Python. We are able to handle each data cube by developing a different Multicorn foreign data wrapper for Use Cases 2, 3 and 5.
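
To give a feel for what a Multicorn wrapper involves, here is a minimal sketch that exposes a toy in-memory cube as relational rows. It is not the project’s wrapper: the cube contents, the class name ToyCubeFdw, the “variable” option, and the column names are all hypothetical. Only the ForeignDataWrapper base class and its execute method come from Multicorn; a stand-in base class is defined so the sketch also runs outside PostgreSQL.

```python
try:
    # Available when the code runs inside PostgreSQL with Multicorn installed.
    from multicorn import ForeignDataWrapper
except ImportError:
    class ForeignDataWrapper:
        """Stand-in base class so the sketch can run without Multicorn."""

        def __init__(self, options, columns):
            pass


# Hypothetical toy cube: variable -> {(time, lat, lon): value}.
# A real wrapper would open a local or remote data cube instead.
TOY_CUBE = {
    "air_temperature": {
        ("2020-07-01", 38.0, 23.5): 31.2,
        ("2020-07-01", 38.0, 24.0): 30.8,
        ("2020-07-02", 38.0, 23.5): 33.1,
    },
}


class ToyCubeFdw(ForeignDataWrapper):
    """Exposes one cube variable as rows with columns time, lat, lon, value."""

    def __init__(self, options, columns):
        super().__init__(options, columns)
        # "variable" is an assumed table option naming the cube variable.
        self.variable = options["variable"]

    def execute(self, quals, columns):
        # quals (the WHERE conditions) are ignored here; PostgreSQL
        # re-checks them on the rows returned by the wrapper.
        cells = TOY_CUBE[self.variable]
        for (time, lat, lon), value in sorted(cells.items()):
            row = {"time": time, "lat": lat, "lon": lon, "value": value}
            # Yield only the columns requested by the query.
            yield {name: row[name] for name in columns}


# Local usage outside PostgreSQL, for illustration only.
fdw = ToyCubeFdw({"variable": "air_temperature"}, {})
rows = list(fdw.execute([], ["time", "value"]))
```

A production wrapper would typically push the spatial and temporal conditions in quals down to the cube so that only the relevant slice is read, rather than scanning every cell as this sketch does.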

The Semantic Data Cube is the first approach of its kind internationally and has been implemented by extending the geospatial ontology-based data access system Ontop-spatial. The new system allows users to query Earth Observation data and information/knowledge extracted from the data, using a semantic query language. Each query is rewritten using ontology axioms and mappings, and is executed at the data sources (data cubes and other external data sources). The answers are collected and returned to the user.
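
The mapping-based part of this rewriting can be illustrated with a deliberately simplified sketch. It is not Ontop’s algorithm and it omits ontology reasoning entirely: it only shows how triple patterns over a shared subject could be unfolded into a SQL join using mappings. The property names, table names, and column names are hypothetical.

```python
# Hypothetical mappings: ontology property -> (table, subject column, object column).
MAPPINGS = {
    "ex:hasTemperature": ("cube_air_temperature", "cell_id", "value"),
    "ex:hasGeometry": ("cube_cells", "cell_id", "wkt"),
}


def unfold(patterns):
    """Translate triple patterns that share one subject variable into a SQL join."""
    subjects = {subject for subject, _prop, _obj in patterns}
    assert len(subjects) == 1, "this toy sketch handles a single shared subject only"

    select_parts, from_parts = [], []
    first_key = None
    for i, (_subject, prop, obj) in enumerate(patterns):
        table, subject_col, object_col = MAPPINGS[prop]
        alias = f"t{i}"
        # Each object variable becomes an output column named after the variable.
        select_parts.append(f"{alias}.{object_col} AS {obj.lstrip('?')}")
        if first_key is None:
            first_key = f"{alias}.{subject_col}"
            from_parts.append(f"FROM {table} {alias}")
        else:
            # A shared subject variable becomes a join on the subject columns.
            from_parts.append(
                f"JOIN {table} {alias} ON {alias}.{subject_col} = {first_key}"
            )
    return "SELECT " + ", ".join(select_parts) + " " + " ".join(from_parts)


sql = unfold([
    ("?cell", "ex:hasTemperature", "?temp"),
    ("?cell", "ex:hasGeometry", "?wkt"),
])
print(sql)
```

In the actual system, the resulting SQL would be evaluated over tables backed by the foreign data wrappers, so the data stays in the cubes and is only accessed at query time.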

Interested in learning more? Contact us!
Manolis Koubarakis,
George Stamoulis,