Asset Hierarchy Migration
Prerequisites:
- Access to a CDF Project.
- Know how to install and set up Python.
- Know how to launch a Python notebook.
In this tutorial, we will show you how to migrate an asset hierarchy to a data model representing the same hierarchy in CDF.
This tutorial also demonstrates the breadth of neat's capabilities: extracting data, creating a data model, exporting the data model, and loading the data.
Extract Data from Asset Hierarchy
We will start by extracting the data from an existing asset hierarchy.
The example we will use in this tutorial is an asset hierarchy for pumps, as it appears in CDF's classic Data Exploration.
from cognite.neat import get_cognite_client, NeatSession
We start by instantiating a new NeatSession:
client = get_cognite_client(".env")
Found .env file in repository root. Loaded variables from .env file.
neat = NeatSession(client, storage="memory")
neat
neat.read.cdf.classic.assets(root_asset_external_id="lift_pump_stations:root")
Output()
Asset hierarchy lift_pump_stations:root read successfully
neat
Instances
Overview:
- 1 types
- 245 instances
Type | Occurrence | |
---|---|---|
0 | Asset | 245 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using AssetsExtractor
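The provenance above mentions that the assets are stored as triples. As a rough illustration of that representation only, the sketch below turns one asset record into subject, predicate, object tuples. It uses plain Python tuples rather than neat's graph store, the record and the `asset_to_triples` helper are made up for the example, and the way identifiers are built here is an assumption, not neat's actual scheme.

```python
# Namespace as it appears in the inferred property references in this tutorial.
NEAT = "http://purl.org/cognite/neat#"


def asset_to_triples(asset: dict) -> list:
    """Express one asset record as (subject, predicate, object) tuples."""
    subject = NEAT + asset["external_id"]
    # One triple states the type, then one triple per field of the record.
    triples = [(subject, "rdf:type", NEAT + "Asset")]
    for key, value in asset.items():
        triples.append((subject, NEAT + key, value))
    return triples


# Hypothetical asset record, not taken from the real hierarchy.
example_asset = {"external_id": "pump:1a", "name": "Pump 1A"}
triples = asset_to_triples(example_asset)
```

Each field of the record becomes its own triple, which is why a single type with many properties can still be stored in one uniform structure.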
Inferring Data Model from Data
With the asset data in the store, we can now infer the data model:
issues = neat.infer()
issues
Data Model Inference <class 'cognite.neat._store._base.NeatGraphStore'> <cognite.neat._store._base.NeatGraphStore object at 0x000001A27613CCE0> read successfully
identifier | resourceType | propertyName | defaultAction | recommendedAction | NeatIssue | |
---|---|---|---|---|---|---|
0 | Asset:dataset | Property | dataset | Remove the property from the rules | Make sure that graph is complete | PropertyValueTypeUndefinedWarning |
We see in the inference that there is a DataSet type we do not know about. However, this is just a warning, so we can continue.
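To make the idea of inference more concrete, here is a minimal sketch of how property types could be derived from instance data by counting the value types seen for each property. This is not neat's implementation; the `infer_property_types` helper and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def infer_property_types(instances):
    """Count which Python value types occur for each property across instances."""
    seen = defaultdict(Counter)
    for instance in instances:
        for prop, value in instance.items():
            seen[prop][type(value).__name__] += 1
    return {prop: dict(counts) for prop, counts in seen.items()}


# Made-up records standing in for assets read from the hierarchy.
assets = [
    {"name": "Pump A", "PumpHP": 15.0, "PumpModel": "XJ-200"},
    {"name": "Pump B", "PumpHP": 7.5, "PumpModel": 3127},
    {"name": "Station 1"},
]

inferred = infer_property_types(assets)
```

A property that appears with more than one value type, such as `PumpModel` in this sample, ends up with several entries in its count dictionary.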
neat.verify()
identifier | resourceType | propertyName | defaultAction | recommendedAction | NeatIssue | value | pattern | patternName | |
---|---|---|---|---|---|---|---|---|---|
0 | Asset:dataset | Property | dataset | Remove the property from the rules | Make sure that graph is complete | PropertyValueTypeUndefinedWarning | NaN | NaN | NaN |
1 | property | NaN | NaN | NaN | NaN | RegexViolationWarning | Shape__Length | ^(\*)|(?!^(Property|property)$)(^[a-zA-Z][a-zA... | MoreThanOneNonAlphanumeric |
First, we check whether there were any issues with creating the rules. In addition to the warning from the inference, we find one new warning: the property Shape__Length contains a double underscore, which is not recommended. However, we can continue.
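The tutorial proceeds without renaming anything. If you want to spot such names yourself before verifying, a small check along these lines may help. It assumes that the warning is triggered by two or more consecutive non-alphanumeric characters, which is a guess based on the pattern name MoreThanOneNonAlphanumeric; both helper functions are hypothetical and not part of neat.

```python
import re

# Assumption: a run of two or more non-alphanumeric characters triggers the warning.
_NON_ALNUM_RUN = re.compile(r"[^a-zA-Z0-9]{2,}")


def has_repeated_separator(name: str) -> bool:
    """Return True if the name contains consecutive non-alphanumeric characters."""
    return _NON_ALNUM_RUN.search(name) is not None


def collapse_separators(name: str) -> str:
    """Replace each run of non-alphanumeric characters with a single underscore."""
    return _NON_ALNUM_RUN.sub("_", name)


property_names = ["PumpHP", "Shape__Length", "LowHeadFT"]
flagged = [name for name in property_names if has_repeated_separator(name)]
suggested = {name: collapse_separators(name) for name in flagged}
```

Here `suggested` only proposes alternative names; applying them would mean renaming the metadata keys in the source data before extraction.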
We can inspect the properties that have been inferred:
neat.inspect.properties
class_ | property_ | value_type | max_count | reference | transformation | comment | |
---|---|---|---|---|---|---|---|
0 | inference_space:Asset | name | string | 1 | http://purl.org/cognite/neat#name | inferred:Asset(inferred:name) | Class <Asset> has property <name> with value t... |
1 | inference_space:Asset | external_id | string | 1 | http://purl.org/cognite/neat#external_id | inferred:Asset(inferred:external_id) | Class <Asset> has property <external_id> with ... |
2 | inference_space:Asset | created_time | dateTime | 1 | http://purl.org/cognite/neat#created_time | inferred:Asset(inferred:created_time) | Class <Asset> has property <created_time> with... |
3 | inference_space:Asset | last_updated_time | dateTime | 1 | http://purl.org/cognite/neat#last_updated_time | inferred:Asset(inferred:last_updated_time) | Class <Asset> has property <last_updated_time>... |
4 | inference_space:Asset | parent | inference_space:Asset | 1 | http://purl.org/cognite/neat#parent | inferred:Asset(inferred:parent) | Class <Asset> has property <parent> with value... |
5 | inference_space:Asset | root | inference_space:Asset | 1 | http://purl.org/cognite/neat#root | inferred:Asset(inferred:root) | Class <Asset> has property <root> with value t... |
6 | inference_space:Asset | dataset | anyURI | 1 | http://purl.org/cognite/neat#dataset | inferred:Asset(inferred:dataset) | Class <Asset> has property <dataset> with valu... |
7 | inference_space:Asset | description | string | 1 | http://purl.org/cognite/neat#description | inferred:Asset(inferred:description) | Class <Asset> has property <description> with ... |
8 | inference_space:Asset | DesignPointFlowGPM | double | 1 | http://purl.org/cognite/neat#DesignPointFlowGPM | inferred:Asset(inferred:DesignPointFlowGPM) | Class <Asset> has property <DesignPointFlowGPM... |
9 | inference_space:Asset | DesignPointHeadFT | double | 1 | http://purl.org/cognite/neat#DesignPointHeadFT | inferred:Asset(inferred:DesignPointHeadFT) | Class <Asset> has property <DesignPointHeadFT>... |
10 | inference_space:Asset | Enabled | double | 1 | http://purl.org/cognite/neat#Enabled | inferred:Asset(inferred:Enabled) | Class <Asset> has property <Enabled> with valu... |
11 | inference_space:Asset | FacilityID | string | 1 | http://purl.org/cognite/neat#FacilityID | inferred:Asset(inferred:FacilityID) | Class <Asset> has property <FacilityID> with v... |
12 | inference_space:Asset | HighHeadShutOff | double | 1 | http://purl.org/cognite/neat#HighHeadShutOff | inferred:Asset(inferred:HighHeadShutOff) | Class <Asset> has property <HighHeadShutOff> w... |
13 | inference_space:Asset | InstallDate | string | 1 | http://purl.org/cognite/neat#InstallDate | inferred:Asset(inferred:InstallDate) | Class <Asset> has property <InstallDate> with ... |
14 | inference_space:Asset | LifeCycleStatus | string | 1 | http://purl.org/cognite/neat#LifeCycleStatus | inferred:Asset(inferred:LifeCycleStatus) | Class <Asset> has property <LifeCycleStatus> w... |
15 | inference_space:Asset | LiftStationID | string | 1 | http://purl.org/cognite/neat#LiftStationID | inferred:Asset(inferred:LiftStationID) | Class <Asset> has property <LiftStationID> wit... |
16 | inference_space:Asset | LocationDescription | string | 1 | http://purl.org/cognite/neat#LocationDescription | inferred:Asset(inferred:LocationDescription) | Class <Asset> has property <LocationDescriptio... |
17 | inference_space:Asset | LowHeadFT | double | 1 | http://purl.org/cognite/neat#LowHeadFT | inferred:Asset(inferred:LowHeadFT) | Class <Asset> has property <LowHeadFT> with va... |
18 | inference_space:Asset | LowHeadFlowGPM | double | 1 | http://purl.org/cognite/neat#LowHeadFlowGPM | inferred:Asset(inferred:LowHeadFlowGPM) | Class <Asset> has property <LowHeadFlowGPM> wi... |
19 | inference_space:Asset | Position | string | 1 | http://purl.org/cognite/neat#Position | inferred:Asset(inferred:Position) | Class <Asset> has property <Position> with val... |
20 | inference_space:Asset | PumpHP | double | 1 | http://purl.org/cognite/neat#PumpHP | inferred:Asset(inferred:PumpHP) | Class <Asset> has property <PumpHP> with value... |
21 | inference_space:Asset | PumpModel | integer | string | 1 | http://purl.org/cognite/neat#PumpModel | inferred:Asset(inferred:PumpModel) | Class <Asset> has property <PumpModel> with va... |
22 | inference_space:Asset | PumpOff | double | 1 | http://purl.org/cognite/neat#PumpOff | inferred:Asset(inferred:PumpOff) | Class <Asset> has property <PumpOff> with valu... |
23 | inference_space:Asset | PumpOn | double | 1 | http://purl.org/cognite/neat#PumpOn | inferred:Asset(inferred:PumpOn) | Class <Asset> has property <PumpOn> with value... |
24 | inference_space:Asset | Shape__Length | double | 1 | http://purl.org/cognite/neat#Shape__Length | inferred:Asset(inferred:Shape__Length) | Class <Asset> has property <Shape__Length> wit... |
25 | inference_space:Asset | VFD | string | 1 | http://purl.org/cognite/neat#VFD | inferred:Asset(inferred:VFD) | Class <Asset> has property <VFD> with value ty... |
We notice that, for example, PumpModel is both an integer and a string, as the Inference found data of both types.
We can inspect the comment from the Inference for this property:
neat.inspect.properties.loc[21, "comment"]
'Class <Asset> has property <PumpModel> with value types <string> which occurs <140> times and <integer> which occurs <6> times in the graph'
We see that PumpModel is most likely a string, as string values occurred far more often in the graph (140 times) than integer values (6 times).
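The same reasoning can be applied programmatically. The sketch below parses a comment of the form shown above and picks the value type with the highest count. It assumes the comment keeps the exact wording seen in this output; `dominant_value_type` is a hypothetical helper, not a neat function.

```python
import re

comment = (
    "Class <Asset> has property <PumpModel> with value types <string> "
    "which occurs <140> times and <integer> which occurs <6> times in the graph"
)


def dominant_value_type(comment: str) -> tuple:
    """Return the most frequent value type and its share of all occurrences."""
    pairs = re.findall(r"<(\w+)> which occurs <(\d+)> times", comment)
    counts = {value_type: int(n) for value_type, n in pairs}
    if not counts:
        raise ValueError("no value-type counts found in comment")
    best = max(counts, key=counts.get)
    return best, counts[best] / sum(counts.values())


value_type, share = dominant_value_type(comment)
```

For this comment, string accounts for 140 of the 146 observed values, which supports treating the property as a string.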
Exporting Data Model
Let's export our newly created data model to CDF. First, we need to convert it to a physical format.
neat.convert("dms")
Rules converted to dms
C:\Users\AndersAlbert\AppData\Local\pypoetry\Cache\virtualenvs\cognite-neat-WszCo0Uu-py3.12\Lib\site-packages\pydantic\main.py:212: RegexViolationWarning: ('Shape__Length', '(?!^(property|space|externalId|createdTime|lastUpdatedTime|deletedTime|edge_id|node_id|project_id|property_group|seq|tg_table_name|extensions)$)(^[a-zA-Z][a-zA-Z0-9_]{0,253}[a-zA-Z0-9]?$)', 'property', 'MoreThanOneNonAlphanumeric') validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self) C:\Users\AndersAlbert\AppData\Local\pypoetry\Cache\virtualenvs\cognite-neat-WszCo0Uu-py3.12\Lib\site-packages\pydantic\main.py:212: RegexViolationWarning: ('Shape__Length', '^(\\*)|(?!^(Property|property)$)(^[a-zA-Z][a-zA-Z0-9._-]{0,253}[a-zA-Z0-9]?$)', 'property', 'MoreThanOneNonAlphanumeric') validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
neat
Data Model
type | Physical Data Model |
---|---|
intended for | DMS Architect |
name | InferredDataModel |
space | inference_space |
external_id | inferreddatamodel |
version | v1 |
views | 1 |
containers | 1 |
properties | 26 |
Instances
Overview:
- 1 types
- 245 instances
Type | Occurrence | |
---|---|---|
0 | Asset | 245 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using AssetsExtractor
- Added rules to graph store as InformationRules
- Upsert prefixes to graph store
neat.to.cdf.data_model()
name | created | |
---|---|---|
0 | spaces | 1 |
1 | containers | 1 |
2 | views | 1 |
3 | data_models | 1 |
We see the data model was successfully created.
Populating Data Model
Now that the data model is ready, we can move the instances to CDF.
First, we ensure the instance space exists.
from cognite.client import data_modeling as dm
created = client.data_modeling.spaces.apply(dm.SpaceApply("sp_pump_station"))
created
value | |
---|---|
space | sp_pump_station |
is_global | False |
last_updated_time | 2024-07-09 05:31:00.944000 |
created_time | 2024-07-09 05:31:00.944000 |
We can now use the loader to populate the data model in CDF.
neat.to.cdf.instances(created.space)
name | changed | |
---|---|---|
0 | Nodes | 245.0 |
1 | Edges | NaN |
As we see from the result above, Neat has created 245 Nodes in the new data model.
Results
Final Remarks
- In this tutorial, we used the in-memory version of the Neat store. This works well for small examples, like the toy example here, but for larger asset hierarchies we likely need to use a faster triple store such as GraphDB or Oxigraph. These are also available in Neat, but require extra dependencies.
- This can be considered the first step of a full migration. At least two related problems may remain:
  - First, we might want to infer a more specific type than Asset, for example, Pump and LiftStation. This means adding information that is not explicitly set in the existing Asset Hierarchy. The type might be implicitly defined by the level in the hierarchy or, for example, by the external ID of the asset. See part 2 for an example of how to add types in the migration process.
  - Second, we might want to map the inferred model onto an existing data model. In this case, the existing model would be an EnterpriseModel and the inferred model we obtained here would be a Source model.
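As a rough sketch of the first remark, the snippet below assigns a more specific type to each asset based on its depth in the hierarchy. It is plain Python and independent of neat. Apart from the root, the external IDs are made up, and the depth-to-type mapping is an assumption about how such a hierarchy might be laid out. Part 2 of the tutorial covers the actual approach.

```python
# Hypothetical parent links keyed by external ID; the real hierarchy will differ.
parents = {
    "lift_pump_stations:root": None,
    "station:1": "lift_pump_stations:root",
    "pump:1a": "station:1",
    "pump:1b": "station:1",
}

# Assumed mapping from depth in the hierarchy to a more specific type.
TYPE_BY_DEPTH = {0: "Asset", 1: "LiftStation", 2: "Pump"}


def depth(external_id, parents):
    """Number of parent links between an asset and the root."""
    steps = 0
    current = parents[external_id]
    while current is not None:
        steps += 1
        current = parents[current]
    return steps


# Fall back to the generic Asset type for depths without a mapping.
types = {xid: TYPE_BY_DEPTH.get(depth(xid, parents), "Asset") for xid in parents}
```

The same structure works for an external ID rule: replace `depth` with a function that inspects the ID, for example its prefix.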