Creating Enterprise Data Model by Selecting and Extending Concepts from Cognite's Core Data Model¶
Prerequisites:
- Basic understanding of data modeling in CDF
- Basic understanding of the Core Data Model
- Access to a CDF project
- Knowledge of how to install and set up Python
- A Python notebook to run the code in
In this tutorial, we will show how to extend the core data model to create an extension specific to your domain. We will demonstrate this process by building a tiny wind farm data model.
Load NEAT methods and starting NeatSession¶
Interaction with NEAT is done through a so-called NeatSession. A NeatSession is typically instantiated with a Cognite client, which allows us to connect to CDF and to read and write data models and instances. Therefore, we will import NeatSession and the convenience method get_cognite_client:
from cognite.neat import NeatSession, get_cognite_client
If you do not have a .env file stored locally, call get_cognite_client() first to create one:
client = get_cognite_client(".env")
Found .env file in repository root. Loaded variables from .env file.
neat = NeatSession(client)
Neat Engine 2.0.4 loaded.
Subset Core Data Model with desired set of concepts¶
Cognite's Core Data Model (CDM for short) is a CDF system data model maintained by Cognite. To extend CDM, and thus make our own extension specific to a domain, we will create references between CDM and the concepts we define in our own data model.
CDM consists of 33 concepts (divided into 28 so-called Core Concepts and 5 Core Features), many of which (15+) are related to 3D.
Since we are building a tiny wind farm data model, we will select only a small subset of concepts, which will be turned into editable concepts that we can extend and tune to our needs. Specifically, we will select the following concepts:
CogniteAsset
CogniteEquipment
CogniteTimeSeries
CogniteActivity
CogniteDescribable
By extending the above subset of core concepts, we will create a wind farm data model that contains the following concepts:
Location
WindFarm
WindTurbine
Substation
MetMast
To simplify this process, we have created the convenience method neat.read.cdf.core_data_model(), which creates an editable set of CDM concepts that we can extend.
Let's call this method and pass the list of desired CDM concepts:
neat.read.cdf.core_data_model(
["CogniteAsset", "CogniteEquipment", "CogniteTimeSeries", "CogniteActivity", "CogniteDescribable"]
)
[WARNING] Experimental feature 'core_data_model_subsetting' is subject to change without notice
Succeeded with warnings: Read NEAT(verified,physical,cdf_cdm,CogniteCore,v1)
NeatIssue | count |
---|---|
NotNeatSupportedFilterWarning | 7 |
Hint: Use the .inspect.issues() for more details.
Do not be confused by the warnings you may get when reading CDM into NeatSession. They simply indicate that filters are used in CDM. We strongly advise against using filters, as it is easy to make mistakes when setting them.
Let's now inspect the content of NeatSession by calling neat, which gives a summary of the data model created in NeatSession:
neat
Data Model
aspect | physical |
---|---|
intended for | DMS Architect |
name | CopyOf enterprise data model |
space | my_space |
external_id | MyCDMSubset |
version | v1 |
views | 38 |
containers | 5 |
properties | 21 |
Calling neat presents an overview of the data model, which can be edited further in Excel to yield a tiny wind farm data model.
The overview shows 38 views: 33 correspond to the 33 CDM concepts, while the additional 5 are the editable versions of the concepts we selected. Due to current UI limitations, we are forced to incorporate all 33 CDM concepts into our data model. This is a temporary solution until an updated version of the UI is available.
As expected, there are only 5 containers in our data model. They match the editable versions of the selected concepts (i.e. views) so that we can add additional properties.
Before we proceed with editing the data model in Excel, let's update its data model id as well as its name:
neat.set.data_model_id(("wind_energy_space", "TinyWindFarmModel", "v1"), name="Tiny Wind Farm Model")
Success: NEAT(verified,physical,my_space,MyCDMSubset,v1) → NEAT(verified,physical,wind_energy_space,TinyWindFarmModel,v1)
neat
Data Model
aspect | physical |
---|---|
intended for | DMS Architect |
name | Tiny Wind Farm Model |
space | wind_energy_space |
external_id | TinyWindFarmModel |
version | v1 |
views | 38 |
containers | 5 |
properties | 21 |
NeatSession is restrictive when it comes to manually editing a data model, whereas the Excel environment provides much greater freedom.
Therefore, let's now export the data model in Excel format and continue editing it outside of NeatSession and the notebook environment.
neat.to.excel("wind-farm-data-model.xlsx")
Extending subset of CDM concepts¶
Inspecting the exported Excel representation of the data model, one can see in detail the results of the neat.read.cdf.core_data_model method, which did the following:
- Read the Core Data Model into NeatSession
- Create editable versions of the concepts we selected from CDM, with names prefixed by CopyOf
neat creates CopyOfAsset, CopyOfEquipment, etc., and makes sure that CopyOfAsset implements CogniteAsset, CopyOfEquipment implements CogniteEquipment, etc.
- Adjust connections between the editable versions of concepts
In CogniteEquipment, the property asset points to CogniteAsset. For CopyOfEquipment, neat updates this connection so that it points to CopyOfAsset instead. This update is necessary in order to consume data through your own concepts rather than the concepts of CDM, e.g. it enables Search, pygen-generated SDKs, and GraphQL querying to work as expected.
- Add a dummy property to every editable concept, whose name, if not specified, takes the form <nameOfConcept>GUID
This property serves two purposes. First, it shows users how they can add new properties to the editable versions of concepts. Second, by adding a specific property to the editable version of a concept, one can skip adding filters to ensure that data is consumed through user-defined concepts. These additional properties are stored in a new set of containers.
- Add new containers to store the additional properties of editable concepts that are not part of the CDM concepts they implement
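The steps above can be illustrated with a small plain-Python sketch. It is not NEAT's implementation: the function names, the dictionary layout, the simplified CDM_CONNECTIONS table (which lists only the asset connection mentioned above), and the lowerCamelCase form of the dummy property name are all assumptions made for illustration.

```python
# Illustrative only: a toy model of the subsetting steps, not NEAT's code.

# Simplified stand-in for CDM: concept -> {property: target concept}.
# Only the connection mentioned in the text is listed (assumption).
CDM_CONNECTIONS = {
    "CogniteAsset": {},
    "CogniteEquipment": {"asset": "CogniteAsset"},
    "CogniteTimeSeries": {},
    "CogniteActivity": {},
    "CogniteDescribable": {},
}


def editable_name(cdm_name, prefix="CopyOf"):
    """Turn e.g. 'CogniteAsset' into 'CopyOfAsset'."""
    stem = cdm_name[len("Cognite"):] if cdm_name.startswith("Cognite") else cdm_name
    return prefix + stem


def make_editable_subset(selected, connections, prefix="CopyOf"):
    """Create editable copies that implement the selected CDM concepts."""
    renamed = {name: editable_name(name, prefix) for name in selected}
    subset = {}
    for cdm_name, new_name in renamed.items():
        # Re-point connections to the editable copy when the target was
        # selected too; otherwise keep pointing at the CDM concept.
        repointed = {
            prop: renamed.get(target, target)
            for prop, target in connections.get(cdm_name, {}).items()
        }
        subset[new_name] = {
            "implements": cdm_name,
            "connections": repointed,
            # Dummy property of the form <nameOfConcept>GUID
            # (lowerCamelCase casing is an assumption).
            "dummy_property": new_name[0].lower() + new_name[1:] + "GUID",
        }
    return subset


selected = ["CogniteAsset", "CogniteEquipment", "CogniteTimeSeries",
            "CogniteActivity", "CogniteDescribable"]
for name, spec in make_editable_subset(selected, CDM_CONNECTIONS).items():
    print(name, spec)
```

Note how the asset connection of the editable equipment concept targets the editable asset concept only because both were selected; a concept selected on its own would keep pointing at the CDM concept.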
In Excel, we will edit the exported data model to produce the desired wind farm data model. Specifically, we will take the following steps:
- Rename and further extend an editable concept
We would like to have location information for our assets, containing the following properties:
- name
- description
- latitude
- longitude
- height
Since name and description are already part of the CopyOfDescribable concept, through its implementation of CogniteDescribable, we will:
- Rename CopyOfDescribable to Location
- Add the properties latitude, longitude and height to the Location concept
- Add units to properties
We will also set units for latitude, longitude and height: degrees for latitude and longitude, and meters for height. This is done by specifying the Value Type with a unit, e.g. float(unit=angle:deg) (the list of units and their external IDs can be found here).
- Update dummy property
We will rename the property neatOrgAssetGUID, which was added to the CopyOfAsset concept, to location, set its connection type to direct, and update its value type to Location.
- Create new concepts out of an editable concept
We will create the following concepts by implementing CopyOfAsset, each with one specific property:
- WindFarm with capacityFactor, whose value type will be float32
- WindTurbine with activePower, whose value type will be float32
- Substation with voltageLevel, whose value type will be float32
- MetMast with iecCompliant, whose value type will be boolean
- Add explicit connections between new concepts
We would like to have an explicit connection between WindFarm and its underlying assets WindTurbine, Substation and MetMast. To achieve this, we will create direct connections:
- from WindTurbine to WindFarm via property windFarm
- from Substation to WindFarm via property windFarm
- from MetMast to WindFarm via property windFarm
In addition, we will create reverse connections based on these properties:
- from WindFarm to WindTurbine via property windTurbine
- from WindFarm to Substation via property substation
- from WindFarm to MetMast via property metMast
Note that direct connections require storage, so we map their View properties to Container properties. Reverse connections, on the other hand, do not require storage, so their View properties are not mapped to Container properties.
- Update metadata
Finally, we will update the description of the data model in the Metadata sheet.
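To keep track of what the edited spreadsheet should end up describing, the target model can be written down as a small plain-Python sketch. This is only a checklist-style illustration, not how NEAT represents data models: the Property and Concept classes and the helper names are hypothetical, and the unit external id length:m for meters is an assumption (only angle:deg appears above).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Property:
    name: str
    value_type: str
    connection: Optional[str] = None  # None, "direct" or "reverse"

    @property
    def needs_container(self) -> bool:
        # Per the steps above: direct connections (and plain properties)
        # need storage, reverse connections do not.
        return self.connection != "reverse"


@dataclass
class Concept:
    name: str
    implements: str
    properties: List[Property] = field(default_factory=list)


def with_unit(base_type: str, unit_external_id: str) -> str:
    """Format a Value Type with a unit, e.g. float(unit=angle:deg)."""
    return f"{base_type}(unit={unit_external_id})"


def lower_camel(name: str) -> str:
    return name[0].lower() + name[1:]


def build_wind_farm_model():
    location = Concept("Location", "CogniteDescribable", [
        Property("latitude", with_unit("float", "angle:deg")),
        Property("longitude", with_unit("float", "angle:deg")),
        Property("height", with_unit("float", "length:m")),  # unit id assumed
    ])
    asset = Concept("CopyOfAsset", "CogniteAsset",
                    [Property("location", "Location", "direct")])
    wind_farm = Concept("WindFarm", "CopyOfAsset",
                        [Property("capacityFactor", "float32")])
    children = [
        Concept("WindTurbine", "CopyOfAsset", [Property("activePower", "float32")]),
        Concept("Substation", "CopyOfAsset", [Property("voltageLevel", "float32")]),
        Concept("MetMast", "CopyOfAsset", [Property("iecCompliant", "boolean")]),
    ]
    for child in children:
        # Direct connection child -> WindFarm, plus its reverse on WindFarm.
        child.properties.append(Property("windFarm", "WindFarm", "direct"))
        wind_farm.properties.append(
            Property(lower_camel(child.name), child.name, "reverse"))
    return {c.name: c for c in [location, asset, wind_farm, *children]}


model = build_wind_farm_model()
for concept in model.values():
    stored = [p.name for p in concept.properties if p.needs_container]
    print(f"{concept.name} (implements {concept.implements}): stored={stored}")
```

The needs_container flag mirrors the mapping rule from the connection step: only properties that require storage get a Container property.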
Read edited data model and upload it to CDF¶
We will read the manually edited Excel file into NeatSession using ...read.excel(filename, enable_manual_edit=True). Note that we set the argument enable_manual_edit to True, which signals to neat to try to read in the manually edited data model and join it into the provenance trail.
You can download wind-farm-data-model-manual-edited.xlsx
neat.read.excel("wind-farm-data-model-manual-edited.xlsx", enable_manual_edit=True)
[WARNING] Experimental feature 'enable_manual_edit' is subject to change without notice
Succeeded with warnings: Read NEAT(verified,physical,wind_energy_space,TinyWindFarmModel,v1)
NeatIssue | count |
---|---|
NeatValueWarning | 18 |
NotNeatSupportedFilterWarning | 7 |
Hint: Use the .inspect.issues() for more details.
Finally, let's push the data model to CDF:
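A minimal sketch of the push step, assuming the session exposes the CDF exporter as to.cdf.data_model(). That call is an assumption about the NEAT API and is not confirmed by this tutorial, and the helper push_data_model is hypothetical; check the NEAT documentation for the exact method and its arguments.

```python
def push_data_model(session):
    """Deploy the data model held by a NeatSession-like object to CDF.

    Assumption: the session provides `to.cdf.data_model()`; this is not
    confirmed by the tutorial text.
    """
    return session.to.cdf.data_model()
```

In the notebook this would be called as push_data_model(neat), after which the deployment outcome can be inspected.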
Let's inspect the outcome of the data model deployment:
neat.inspect.outcome.data_model()
spaces¶
unchanged¶
- wind_energy_space
containers¶
unchanged¶
- ContainerId(space='wind_energy_space', external_id='Location')
- ContainerId(space='wind_energy_space', external_id='WindTurbine')
- ContainerId(space='wind_energy_space', external_id='CopyOfEquipment')
- ContainerId(space='wind_energy_space', external_id='MetMast')
- ContainerId(space='wind_energy_space', external_id='CopyOfTimeSeries')
- ContainerId(space='wind_energy_space', external_id='CopyOfAsset')
- ContainerId(space='wind_energy_space', external_id='WindFarm')
- ContainerId(space='wind_energy_space', external_id='CopyOfActivity')
- ContainerId(space='wind_energy_space', external_id='Substation')
views¶
unchanged¶
- ViewId(space='wind_energy_space', external_id='Location', version='v1')
- ViewId(space='wind_energy_space', external_id='WindTurbine', version='v1')
- ViewId(space='wind_energy_space', external_id='WindFarm', version='v1')
- ViewId(space='wind_energy_space', external_id='CopyOfEquipment', version='v1')
- ViewId(space='wind_energy_space', external_id='CopyOfAsset', version='v1')
- ViewId(space='wind_energy_space', external_id='Substation', version='v1')
- ViewId(space='wind_energy_space', external_id='CopyOfActivity', version='v1')
- ViewId(space='wind_energy_space', external_id='MetMast', version='v1')
- ViewId(space='wind_energy_space', external_id='CopyOfTimeSeries', version='v1')
data_models¶
unchanged¶
- DataModelId(space='wind_energy_space', external_id='TinyWindFarmModel', version='v1')
nodes¶
Let's visualize the full provenance from beginning to end:
neat.show.data_model.provenance()
data_model_provenance_c2bd65be.html