Creating an Enterprise Data Model by Selecting and Extending Concepts from Cognite's Core Data Model

Prerequisites:

Basic understanding of data modeling in CDF
Basic understanding of the Core Data Model
Access to a CDF project
Knowledge of how to install and set up Python
Ability to launch a Python notebook

In this tutorial, we will show how to extend the Core Data Model with an extension specific to your domain. We will demonstrate the process by building a tiny wind farm data model.

Load NEAT methods and start a NeatSession

Interaction with NEAT happens through a NeatSession. A NeatSession is typically instantiated with a Cognite client, which lets us connect to CDF to read and write data models and instances. We therefore import NeatSession and the convenience method get_cognite_client:

In [1]:

from cognite.neat import NeatSession, get_cognite_client

If you do not have a .env file stored locally, call get_cognite_client() first to create one:

In [2]:

client = get_cognite_client(".env")

Found .env file in repository root. Loaded variables from .env file.
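
For readers unfamiliar with .env files, the following is a minimal, standard-library sketch of what loading one involves: reading KEY=VALUE lines into a dictionary. It is not how get_cognite_client is implemented, and the key names used here (CDF_CLUSTER, CDF_PROJECT) are hypothetical placeholders rather than the names NEAT expects.

```python
import tempfile
from pathlib import Path


def read_env_file(path: Path) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and comments."""
    variables = {}
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        variables[key.strip()] = value.strip().strip('"').strip("'")
    return variables


# Hypothetical content, for illustration only.
EXAMPLE_ENV = """\
# connection settings
CDF_CLUSTER=my-cluster
CDF_PROJECT="my-project"
"""

with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(EXAMPLE_ENV)
    settings = read_env_file(env_path)

print(settings)
```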

In [3]:

neat = NeatSession(client)

Neat Engine 2.0.4 loaded.

Subset the Core Data Model to the desired set of concepts

Cognite's Core Data Model (CDM for short) is a CDF system data model maintained by Cognite. To extend CDM with our own domain-specific extension, we will create references between CDM and the concepts we define in our own data model.

CDM consists of 33 concepts (28 so-called Core Concepts and 5 Core Features), many of which (15+ concepts) are related to 3D.

Since we are building a tiny wind farm data model, we will select only a small subset of concepts, which will be turned into editable concepts that we can extend and tune to our needs. Specifically, we will select the following concepts:

CogniteAsset
CogniteEquipment
CogniteTimeSeries
CogniteActivity
CogniteDescribable

By extending the above subset of core concepts, we will create a wind farm data model that contains the following concepts:

Location
WindFarm
WindTurbine
Substation
MetMast

To simplify this process, we have created a convenience method, neat.read.cdf.core_data_model(), which creates an editable set of CDM concepts that we can extend.

Let's call this method and pass the list of desired CDM concepts:

In [4]:

neat.read.cdf.core_data_model(
    ["CogniteAsset", "CogniteEquipment", "CogniteTimeSeries", "CogniteActivity", "CogniteDescribable"]
)

[WARNING] Experimental feature 'core_data_model_subsetting' is subject to change without notice

Out[4]:

Succeeded with warnings: Read NEAT(verified,physical,cdf_cdm,CogniteCore,v1)

NeatIssue	count
NotNeatSupportedFilterWarning	7

Hint: Use the .inspect.issues() for more details.

Do not be confused by the warnings you may get when reading CDM into a NeatSession. They simply point out that filters are used in CDM. We strongly advise against using filters, as it is easy to make mistakes when setting them.

Let's now inspect the content of the NeatSession by calling neat, which gives a summary of the data model created in the session:

In [5]:

neat

Out[5]:

Data Model


aspect	physical
intended for	DMS Architect
name	CopyOf enterprise data model
space	my_space
external_id	MyCDMSubset
version	v1
views	38
containers	5
properties	21

Calling neat presents an overview of the data model, which can be edited further in Excel to yield a tiny wind farm data model.

The overview shows 38 views: 33 correspond to the 33 CDM concepts, while the additional 5 are editable versions of the concepts we selected. Due to current UI limitations, we are forced to incorporate all 33 CDM concepts into our data model. This is a temporary solution until an updated version of the UI is available.

As expected, there are only 5 containers in our data model. They match the editable versions of the selected concepts (i.e., views) so that we can add additional properties.
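
The counts in the overview can be checked with a few lines of arithmetic. This sketch only restates the numbers given in the text; it does not query NEAT or CDF.

```python
# Numbers taken from the tutorial text.
CORE_CONCEPTS = 28
CORE_FEATURES = 5
cdm_views = CORE_CONCEPTS + CORE_FEATURES  # 33 CDM concepts

selected = [
    "CogniteAsset",
    "CogniteEquipment",
    "CogniteTimeSeries",
    "CogniteActivity",
    "CogniteDescribable",
]

editable_views = len(selected)          # one editable copy per selected concept
total_views = cdm_views + editable_views
containers = editable_views             # one new container per editable view

print(total_views, containers)  # 38 5
```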

Before we proceed with editing the data model in Excel, let's update its data model ID and name:

In [6]:

neat.set.data_model_id(("wind_energy_space", "TinyWindFarmModel", "v1"), name="Tiny Wind Farm Model")

Out[6]:

Success: NEAT(verified,physical,my_space,MyCDMSubset,v1) → NEAT(verified,physical,wind_energy_space,TinyWindFarmModel,v1)

In [7]:

neat

Out[7]:

Data Model


aspect	physical
intended for	DMS Architect
name	Tiny Wind Farm Model
space	wind_energy_space
external_id	TinyWindFarmModel
version	v1
views	38
containers	5
properties	21

A NeatSession is restrictive when it comes to manually editing a data model, whereas Excel provides much greater freedom. Therefore, let's export the data model to Excel and continue editing it outside of the NeatSession and the notebook environment.

In [8]:

neat.to.excel("wind-farm-data-model.xlsx")

Extending the subset of CDM concepts

Inspecting the exported Excel representation of the data model, one can see in detail the results of the neat.read.cdf.core_data_model method, which did the following:

Read the Core Data Model into the NeatSession
Create editable versions of the concepts we selected from CDM, with names prefixed by CopyOf

neat will create CopyOfAsset, CopyOfEquipment, etc., and it will make sure that CopyOfAsset implements CogniteAsset, CopyOfEquipment implements CogniteEquipment, etc.

Adjust connections between the editable versions of concepts

In CogniteEquipment, the property asset points to CogniteAsset. neat updates this connection in CopyOfEquipment so that it points to CopyOfAsset instead. This update is necessary in order to consume data through your own concepts rather than the concepts of CDM; for example, it enables Search, pygen-generated SDKs, and GraphQL querying to work as expected.

Add a dummy property to every editable concept, whose name, if not specified, takes the form <nameOfConcept>GUID

This property serves two purposes. First, it shows users how they can add new properties to the editable versions of concepts. Second, by adding a specific property to the editable version of a concept, one can skip adding filters to ensure that data is consumed through user-defined concepts. These additional properties are stored in a new set of containers.

Add new containers to store the additional properties of editable concepts that are not part of the CDM concepts they implement
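
The steps listed above can be sketched on a toy model in plain Python. This is an illustration of the idea only, not NEAT's implementation: the View class and make_editable_copies function are hypothetical, the two-view "CDM" is a stand-in for the real one, and the casing of the generated dummy property name and its value type are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    name: str
    implements: list = field(default_factory=list)
    # property name -> value type (a primitive type or the name of another view)
    properties: dict = field(default_factory=dict)


def make_editable_copies(cdm, selected, prefix="CopyOf"):
    # Create editable names: CogniteAsset -> CopyOfAsset, etc.
    rename = {name: prefix + name.removeprefix("Cognite") for name in selected}
    copies = {}
    for cdm_name, copy_name in rename.items():
        copy = View(copy_name, implements=[cdm_name])
        # Re-point connections so they target the editable copies.
        for prop, value_type in cdm[cdm_name].properties.items():
            if value_type in rename:
                copy.properties[prop] = rename[value_type]
        # Add a dummy property (name casing and type are assumptions).
        dummy = copy_name[0].lower() + copy_name[1:] + "GUID"
        copy.properties[dummy] = "text"
        copies[copy_name] = copy
    # One new container per editable view stores the added properties.
    containers = list(copies)
    return copies, containers


# Toy stand-in for CDM with only two views.
cdm = {
    "CogniteAsset": View("CogniteAsset", properties={"parent": "CogniteAsset"}),
    "CogniteEquipment": View("CogniteEquipment", properties={"asset": "CogniteAsset"}),
}

copies, containers = make_editable_copies(cdm, ["CogniteAsset", "CogniteEquipment"])
print(copies["CopyOfEquipment"].implements)           # ['CogniteEquipment']
print(copies["CopyOfEquipment"].properties["asset"])  # CopyOfAsset
```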

In Excel, we will edit the exported data model to produce the desired wind farm data model. Specifically, we will perform the following steps:

Rename and further extend an editable concept

We would like to have location information for our assets, with the following properties:

name
description
latitude
longitude
height

Since name and description are part of the CopyOfDescribable concept, through its implementation of CogniteDescribable, we will:

Rename CopyOfDescribable to Location
Add the properties latitude, longitude, and height to the Location concept

Add units to properties

We will also set units for latitude, longitude, and height. Specifically, we will set degrees for latitude and longitude, and meters for height. This is done by specifying the Value Type with a unit, e.g. float(unit=angle:deg) (the list of units and their external IDs can be found here).
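
To make the Value Type notation concrete, here is a small sketch that splits such a string into its base type and optional unit. The parse_value_type helper is hypothetical and is not part of NEAT. The unit external ID length:m used for height is an assumption; only angle:deg appears in the text above.

```python
import re

# Matches "float" as well as "float(unit=angle:deg)".
VALUE_TYPE_PATTERN = re.compile(r"^(?P<base>\w+)(?:\(unit=(?P<unit>[^)]+)\))?$")


def parse_value_type(value_type):
    """Return (base_type, unit_external_id_or_None) for a Value Type string."""
    match = VALUE_TYPE_PATTERN.match(value_type.strip())
    if match is None:
        raise ValueError(f"Unrecognized value type: {value_type!r}")
    return match["base"], match["unit"]


location_properties = {
    "latitude": "float(unit=angle:deg)",
    "longitude": "float(unit=angle:deg)",
    "height": "float(unit=length:m)",  # assumed unit external ID
}

parsed = {name: parse_value_type(vt) for name, vt in location_properties.items()}
print(parsed["latitude"])  # ('float', 'angle:deg')
```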

Update dummy property

We will rename the property neatOrgAssetGUID, which was added to the CopyOfAsset concept, to location, set its connection type to direct, and update its value type to Location.

Create new concepts from an editable concept

We will create:

WindFarm
WindTurbine
Substation
MetMast

concepts by implementing CopyOfAsset and adding the following specific properties, respectively:

capacityFactor (WindFarm), with value type float32
activePower (WindTurbine), with value type float32
voltageLevel (Substation), with value type float32
iecCompliant (MetMast), with value type boolean

Add explicit connections between new concepts

We would like to have explicit connections between WindFarm and its underlying assets WindTurbine, Substation, and MetMast. To achieve this, we will create direct connections:

from WindTurbine to WindFarm via property windFarm
from Substation to WindFarm via property windFarm
from MetMast to WindFarm via property windFarm

In addition, we will create the reverse connections based on these properties:

from WindFarm to WindTurbine via property windTurbine
from WindFarm to Substation via property substation
from WindFarm to MetMast via property metMast

You will notice that direct connections require storage, so we map their View properties to Container properties. Reverse connections, on the other hand, do not require storage, so we do not map their View properties to Container properties.
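
The relationship between the direct and reverse connections can be sketched as plain data. The classes below are hypothetical and only illustrate the idea: each reverse connection is derived from a direct one, and only direct connections need a container mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectConnection:
    source: str   # view that holds the stored property
    prop: str
    target: str


@dataclass(frozen=True)
class ReverseConnection:
    source: str
    prop: str
    target: str
    through: str  # direct property on the target view that is followed backwards


def reverse_of(conn):
    # Name the reverse property after the source view: WindTurbine -> windTurbine.
    prop = conn.source[0].lower() + conn.source[1:]
    return ReverseConnection(conn.target, prop, conn.source, through=conn.prop)


def needs_container_mapping(conn):
    # Direct connections are stored; reverse connections are resolved at query time.
    return isinstance(conn, DirectConnection)


direct = [
    DirectConnection(view, "windFarm", "WindFarm")
    for view in ("WindTurbine", "Substation", "MetMast")
]
reverse = [reverse_of(conn) for conn in direct]

for conn in direct + reverse:
    print(conn.source, conn.prop, "->", conn.target, needs_container_mapping(conn))
```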

Update metadata

Finally, we will update the description of the data model in the Metadata sheet.

Read the edited data model and upload it to CDF

We will read the manually edited Excel file into the NeatSession using ...read.excel(filename, enable_manual_edit=True). Note that we set the argument enable_manual_edit to True, which signals neat to try to read in the manually edited data model and join it into the provenance trail.

You can download wind-farm-data-model-manual-edited.xlsx

In [9]:

neat.read.excel("wind-farm-data-model-manual-edited.xlsx", enable_manual_edit=True)

[WARNING] Experimental feature 'enable_manual_edit' is subject to change without notice

Out[9]:

Succeeded with warnings: Read NEAT(verified,physical,wind_energy_space,TinyWindFarmModel,v1)

NeatIssue	count
NeatValueWarning	18
NotNeatSupportedFilterWarning	7

Hint: Use the .inspect.issues() for more details.

Finally, let's push the data model to CDF:

Let's inspect the outcome of the data model deployment:

In [11]:

neat.inspect.outcome.data_model()

spaces

unchanged

wind_energy_space

containers

unchanged

ContainerId(space='wind_energy_space', external_id='Location')
ContainerId(space='wind_energy_space', external_id='WindTurbine')
ContainerId(space='wind_energy_space', external_id='CopyOfEquipment')
ContainerId(space='wind_energy_space', external_id='MetMast')
ContainerId(space='wind_energy_space', external_id='CopyOfTimeSeries')
ContainerId(space='wind_energy_space', external_id='CopyOfAsset')
ContainerId(space='wind_energy_space', external_id='WindFarm')
ContainerId(space='wind_energy_space', external_id='CopyOfActivity')
ContainerId(space='wind_energy_space', external_id='Substation')

views

unchanged

ViewId(space='wind_energy_space', external_id='Location', version='v1')
ViewId(space='wind_energy_space', external_id='WindTurbine', version='v1')
ViewId(space='wind_energy_space', external_id='WindFarm', version='v1')
ViewId(space='wind_energy_space', external_id='CopyOfEquipment', version='v1')
ViewId(space='wind_energy_space', external_id='CopyOfAsset', version='v1')
ViewId(space='wind_energy_space', external_id='Substation', version='v1')
ViewId(space='wind_energy_space', external_id='CopyOfActivity', version='v1')
ViewId(space='wind_energy_space', external_id='MetMast', version='v1')
ViewId(space='wind_energy_space', external_id='CopyOfTimeSeries', version='v1')

data_models

unchanged

DataModelId(space='wind_energy_space', external_id='TinyWindFarmModel', version='v1')

nodes

Let's visualize the full provenance from beginning to end:

In [12]:

neat.show.data_model.provenance()

data_model_provenance_c2bd65be.html
