Asset Hierarchy Migration¶
Prerequisites:
- Access to a CDF Project.
- Know how to install and set up Python.
- Know how to launch a Python notebook.
In this tutorial, we will show you how to migrate an asset hierarchy to a data model representing the same hierarchy in CDF.
This tutorial also demonstrates the breadth of neat's capabilities: extracting data, creating a data model, exporting the data model, and loading the data.
Extract Data from Asset Hierarchy¶
We will start by extracting the data from an existing asset hierarchy.
The example we will use in this tutorial is an asset hierarchy for pumps, shown below in CDF's classic Data Exploration.
from cognite.neat import get_cognite_client
from cognite.neat.graph import extractors, NeatGraphStore
We start by importing the neat extractors, the NeatGraphStore, and a utility function for getting a Cognite client.
store = NeatGraphStore.from_memory_store()
The NeatGraphStore is a triple store that can hold both data and schemas (data models).
We can populate this store by extracting from one of the sources that neat supports.
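To make the idea of a triple store concrete, here is a minimal sketch in plain Python. It is not how NeatGraphStore is implemented; the class `TinyTripleStore` and the identifiers used in it are hypothetical and only illustrate that instance data and schema statements can live side by side as (subject, predicate, object) triples.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]


class TinyTripleStore:
    """Toy in-memory triple store (illustrative only, not NeatGraphStore)."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one (subject, predicate, object) statement."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in sorted(self._triples):
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


toy_store = TinyTripleStore()
# Instance data: one hypothetical pump asset.
toy_store.add("pump:1", "rdf:type", "Asset")
toy_store.add("pump:1", "name", "Pump 1")
toy_store.add("pump:1", "parent", "station:A")
# Schema information lives in the same store.
toy_store.add("Asset", "rdf:type", "rdfs:Class")

names = [o for _, _, o in toy_store.match(predicate="name")]
print(names)
```

An extractor, in this picture, is anything that produces such triples from a source system.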
To see which extractors are available, we print the extractors module.
extractors
Extractor | Description | |
---|---|---|
0 | AssetsExtractor | Extract data from Cognite Data Fusions Assets ... |
1 | MockGraphGenerator | Class used to generate mock graph data for pur... |
2 | RelationshipsExtractor | Extract data from Cognite Data Fusions Relatio... |
3 | TimeSeriesExtractor | Extract data from Cognite Data Fusions TimeSer... |
4 | SequencesExtractor | Extract data from Cognite Data Fusions Sequenc... |
5 | EventsExtractor | Extract data from Cognite Data Fusions Events ... |
6 | FilesExtractor | Extract data from Cognite Data Fusions files m... |
7 | LabelsExtractor | Extract data from Cognite Data Fusions Labels ... |
8 | RdfFileExtractor | Extract data from RDF files into Neat. |
9 | DexpiExtractor | DEXPI-XML extractor of RDF triples |
At the top of the list we see the AssetsExtractor, which is what we are looking for.
extractors.AssetsExtractor
AssetsExtractor
Extract data from Cognite Data Fusions Assets into Neat.
Available factory methods:
- .from_dataset
- .from_file
- .from_hierarchy
We see that there is a .from_hierarchy factory method that fits well with what we want to do.
client = get_cognite_client()
Found .env file in repository root. Loaded variables from .env file.
Note that the utility function get_cognite_client will prompt us for credentials if it doesn't find a .env file in the repo root.
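For readers unfamiliar with .env files, the sketch below shows how such a file is typically parsed into key-value pairs. It is not neat's implementation, and the variable names `EXAMPLE_PROJECT` and `EXAMPLE_CLUSTER` are hypothetical placeholders; the tutorial does not state which variables get_cognite_client expects.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical file content, for illustration only.
example_env = """
# Credentials for the CDF project
EXAMPLE_PROJECT=my-project
EXAMPLE_CLUSTER="westeurope-1"
"""

settings = parse_env_text(example_env)
print(settings)
```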
asset_extractor = extractors.AssetsExtractor.from_hierarchy(client, root_asset_external_id="lift_pump_stations:root")
The asset extractor is now ready and can be written to the store.
store.write(asset_extractor)
We can inspect the store to see what changes have been applied to it:
store
Provenance Provenance is a record of changes that have occurred in the graph store.
Agent | Activity | Entity | Description | |
---|---|---|---|---|
0 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Initialize graph store as Memory |
1 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Extracted triples to graph store using AssetsE... |
Inferring Data Model from Data¶
To infer a data model from the data, we will use an importer.
from cognite.neat.rules import importers
importers
Importer | Description | |
---|---|---|
0 | OWLImporter | Convert OWL ontology to tables/ transformation... |
1 | DMSImporter | Imports a Data Model from Cognite Data Fusion. |
2 | ExcelImporter | Import rules from an Excel file. |
3 | GoogleSheetImporter | Import rules from a Google Sheet. |
4 | DTDLImporter | Importer from Azure Digital Twin - DTDL (Digit... |
5 | YAMLImporter | Imports the rules from a YAML file. |
6 | InferenceImporter | Infers rules from a triple store. |
We see that there are multiple importers available, but we will use the InferenceImporter.
importers.InferenceImporter
InferenceImporter
Infers rules from a triple store.
Rules inference through analysis of knowledge graph provided in various formats.
Use the factory methods to create an triples store from sources such as
RDF files, JSON files, YAML files, XML files, or directly from a graph store.
Available factory methods:
- .from_graph_store
- .from_json_file
- .from_rdf_file
- .from_xml_file
- .from_yaml_file
importer = importers.InferenceImporter.from_graph_store(store)
Then we use the .to_rules method to convert the data into Rules, which is Neat's format for data models.
rules, issues = importer.to_rules()
First we check if there were any issues with creating the rules, and we find one warning. The warning is that there is a property, Shape__Length, with a double underscore, which is not recommended. However, we can continue.
issues
field_name | value | |
---|---|---|
0 | property | Shape__Length |
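A check of this kind can be sketched with a regular expression. The rule used here, "two or more consecutive non-alphanumeric characters", is an assumption based on the double underscore mentioned above; the exact rule Neat applies may differ, and the helper name `has_repeated_non_alphanumeric` is hypothetical.

```python
import re

# Assumed rule: flag names with two or more consecutive
# non-alphanumeric characters, such as a double underscore.
_REPEATED_NON_ALNUM = re.compile(r"[^A-Za-z0-9]{2,}")


def has_repeated_non_alphanumeric(name: str) -> bool:
    """Return True if the name contains a run of 2+ non-alphanumeric characters."""
    return _REPEATED_NON_ALNUM.search(name) is not None


property_names = ["PumpHP", "last_updated_time", "Shape__Length"]
flagged = [name for name in property_names if has_repeated_non_alphanumeric(name)]
print(flagged)
```

Under this assumed rule, single underscores as in last_updated_time pass, while Shape__Length is flagged.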
Then, we inspect the classes found along with their properties.
rules.classes
class_ | reference | match_type | comment | |
---|---|---|---|---|
0 | inferred:Asset | http://purl.org/cognite/neat#Asset | exact | Inferred from knowledge graph, where this clas... |
rules.properties
class_ | property_ | value_type | max_count | reference | transformation | comment | inherited | |
---|---|---|---|---|---|---|---|---|
0 | inferred:Asset | name | string | 1 | http://purl.org/cognite/neat#name | inferred:Asset(inferred:name) | Class <Asset> has property <name> with value t... | False |
1 | inferred:Asset | external_id | string | 1 | http://purl.org/cognite/neat#external_id | inferred:Asset(inferred:external_id) | Class <Asset> has property <external_id> with ... | False |
2 | inferred:Asset | created_time | dateTime | 1 | http://purl.org/cognite/neat#created_time | inferred:Asset(inferred:created_time) | Class <Asset> has property <created_time> with... | False |
3 | inferred:Asset | last_updated_time | dateTime | 1 | http://purl.org/cognite/neat#last_updated_time | inferred:Asset(inferred:last_updated_time) | Class <Asset> has property <last_updated_time>... | False |
4 | inferred:Asset | DesignPointFlowGPM | double | 1 | http://purl.org/cognite/neat#DesignPointFlowGPM | inferred:Asset(inferred:DesignPointFlowGPM) | Class <Asset> has property <DesignPointFlowGPM... | False |
5 | inferred:Asset | DesignPointHeadFT | double | 1 | http://purl.org/cognite/neat#DesignPointHeadFT | inferred:Asset(inferred:DesignPointHeadFT) | Class <Asset> has property <DesignPointHeadFT>... | False |
6 | inferred:Asset | Enabled | double | 1 | http://purl.org/cognite/neat#Enabled | inferred:Asset(inferred:Enabled) | Class <Asset> has property <Enabled> with valu... | False |
7 | inferred:Asset | HighHeadShutOff | double | 1 | http://purl.org/cognite/neat#HighHeadShutOff | inferred:Asset(inferred:HighHeadShutOff) | Class <Asset> has property <HighHeadShutOff> w... | False |
8 | inferred:Asset | LowHeadFT | double | 1 | http://purl.org/cognite/neat#LowHeadFT | inferred:Asset(inferred:LowHeadFT) | Class <Asset> has property <LowHeadFT> with va... | False |
9 | inferred:Asset | LowHeadFlowGPM | double | 1 | http://purl.org/cognite/neat#LowHeadFlowGPM | inferred:Asset(inferred:LowHeadFlowGPM) | Class <Asset> has property <LowHeadFlowGPM> wi... | False |
10 | inferred:Asset | PumpHP | double | 1 | http://purl.org/cognite/neat#PumpHP | inferred:Asset(inferred:PumpHP) | Class <Asset> has property <PumpHP> with value... | False |
11 | inferred:Asset | PumpOff | double | 1 | http://purl.org/cognite/neat#PumpOff | inferred:Asset(inferred:PumpOff) | Class <Asset> has property <PumpOff> with valu... | False |
12 | inferred:Asset | PumpOn | double | 1 | http://purl.org/cognite/neat#PumpOn | inferred:Asset(inferred:PumpOn) | Class <Asset> has property <PumpOn> with value... | False |
13 | inferred:Asset | Shape__Length | double | 1 | http://purl.org/cognite/neat#Shape__Length | inferred:Asset(inferred:Shape__Length) | Class <Asset> has property <Shape__Length> wit... | False |
14 | inferred:Asset | VFDSetting | double | 1 | http://purl.org/cognite/neat#VFDSetting | inferred:Asset(inferred:VFDSetting) | Class <Asset> has property <VFDSetting> with v... | False |
15 | inferred:Asset | parent | inferred:Asset | 1 | http://purl.org/cognite/neat#parent | inferred:Asset(inferred:parent) | Class <Asset> has property <parent> with value... | False |
16 | inferred:Asset | root | inferred:Asset | 1 | http://purl.org/cognite/neat#root | inferred:Asset(inferred:root) | Class <Asset> has property <root> with value t... | False |
17 | inferred:Asset | description | string | 1 | http://purl.org/cognite/neat#description | inferred:Asset(inferred:description) | Class <Asset> has property <description> with ... | False |
18 | inferred:Asset | FacilityID | string | 1 | http://purl.org/cognite/neat#FacilityID | inferred:Asset(inferred:FacilityID) | Class <Asset> has property <FacilityID> with v... | False |
19 | inferred:Asset | InstallDate | string | 1 | http://purl.org/cognite/neat#InstallDate | inferred:Asset(inferred:InstallDate) | Class <Asset> has property <InstallDate> with ... | False |
20 | inferred:Asset | LifeCycleStatus | string | 1 | http://purl.org/cognite/neat#LifeCycleStatus | inferred:Asset(inferred:LifeCycleStatus) | Class <Asset> has property <LifeCycleStatus> w... | False |
21 | inferred:Asset | LiftStationID | string | 1 | http://purl.org/cognite/neat#LiftStationID | inferred:Asset(inferred:LiftStationID) | Class <Asset> has property <LiftStationID> wit... | False |
22 | inferred:Asset | LocationDescription | string | 1 | http://purl.org/cognite/neat#LocationDescription | inferred:Asset(inferred:LocationDescription) | Class <Asset> has property <LocationDescriptio... | False |
23 | inferred:Asset | Position | string | 1 | http://purl.org/cognite/neat#Position | inferred:Asset(inferred:Position) | Class <Asset> has property <Position> with val... | False |
24 | inferred:Asset | PumpModel | integer | string | 1 | http://purl.org/cognite/neat#PumpModel | inferred:Asset(inferred:PumpModel) | Class <Asset> has property <PumpModel> with va... | False |
25 | inferred:Asset | VFD | string | 1 | http://purl.org/cognite/neat#VFD | inferred:Asset(inferred:VFD) | Class <Asset> has property <VFD> with value ty... | False |
We notice that, for example, PumpModel is both an integer and a string, as the inference found data of both types.
We can inspect the comment that the inference attached to this property:
rules.properties.data[24].comment
'Class <Asset> has property <PumpModel> with value type <integer> which occurs <6> times in the graph, with value type <string> which occurs <140> times in the graph'
And we see that this is most likely a string, as that occurred far more often for this field in the graph than the integer.
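The reasoning in the comment, picking the value type that occurs most often, can be sketched as follows. This is not the InferenceImporter's code; the helper `guess_type` and the sample values are hypothetical, chosen only to reproduce the 6 versus 140 split reported in the comment.

```python
from collections import Counter


def guess_type(raw_value: str) -> str:
    """Very rough type guess: 'integer' if the text parses as int, else 'string'."""
    try:
        int(raw_value)
    except ValueError:
        return "string"
    return "integer"


# Hypothetical PumpModel values matching the counts in the comment above.
pump_model_values = ["3196"] * 6 + ["LF 3196"] * 140

type_counts = Counter(guess_type(value) for value in pump_model_values)
dominant_type = type_counts.most_common(1)[0][0]

print(dict(type_counts))
print(dominant_type)
```

A majority vote like this is only a heuristic; when the model is edited by hand, the string type is the safe choice because every integer value can also be stored as text.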
Exporting Data Model¶
Let's export our newly created data model to CDF.
from cognite.neat.rules import exporters
exporters
Exporter | Description | |
---|---|---|
0 | DMSExporter | Export rules to Cognite Data Fusion's Data Mod... |
1 | SemanticDataModelExporter | Exports rules to a semantic data model. |
2 | OWLExporter | Exports rules to an OWL ontology. |
3 | SHACLExporter | Exports rules to a SHACL graph. |
4 | ExcelExporter | Export rules to Excel. |
5 | YAMLExporter | Export rules to YAML. |
To export the data model we use the DMSExporter.
exporters.DMSExporter
DMSExporter
Export rules to Cognite Data Fusion's Data Model Storage (DMS) service.
To export the rules, we need them in the DMS format. However, the rules
we have are in the Information format.
type(rules)
cognite.neat.rules.models.information._rules.InformationRules
The Information rules are used to model the information, while the DMS format is one of the implementation formats that Neat supports.
Neat has an out-of-the-box conversion from the Information format to the DMS format; however, it does not, for example, set indexes.
dms_rules = rules.as_dms_rules()
C:\Users\AndersAlbert\AppData\Local\pypoetry\Cache\virtualenvs\cognite-neat-WszCo0Uu-py3.12\Lib\site-packages\pydantic\main.py:176: MoreThanOneNonAlphanumericCharacterWarning: ('property', 'Shape__Length') self.__pydantic_validator__.validate_python(data, self_instance=self)
dms_rules.metadata
value | |
---|---|
role | DMS Architect |
data_model_type | enterprise |
schema_ | partial |
extension | addition |
space | inferred |
name | Inferred Model |
description | None |
external_id | inferred_model |
version | inferred |
creator | NEAT |
created | 2024-07-09 07:30:24.429571 |
updated | 2024-07-09 07:30:24.429571 |
dms_rules.views
view | reference | in_model | class_ | |
---|---|---|---|---|
0 | inferred:Asset(version=inferred) | http://purl.org/cognite/neat#Asset | True | inferred:Asset |
dms_rules.containers
container | class_ | |
---|---|---|
0 | inferred:Asset | inferred:Asset |
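To give a feel for what such a conversion involves, the sketch below maps the value types seen in the Information rules to CDF container property types. The mapping table `ASSUMED_TYPE_MAP` and the function `to_container_type` are assumptions made for illustration; the tutorial does not show the mapping Neat actually applies.

```python
# Assumed mapping from Information-rule value types to DMS property types.
ASSUMED_TYPE_MAP = {
    "string": "text",
    "integer": "int64",
    "double": "float64",
    "dateTime": "timestamp",
}


def to_container_type(value_type: str) -> str:
    """Map a value type to a container property type.

    Anything not in the table is treated as a reference to another
    class (for example inferred:Asset) and becomes a direct relation.
    """
    return ASSUMED_TYPE_MAP.get(value_type, "direct")


# A few properties taken from the inferred rules above.
inferred_properties = {
    "name": "string",
    "created_time": "dateTime",
    "PumpHP": "double",
    "parent": "inferred:Asset",
}

container_properties = {
    prop: to_container_type(value_type)
    for prop, value_type in inferred_properties.items()
}
print(container_properties)
```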
If you want to modify the DMS rules, it is recommended that you export them using the ExcelExporter, modify the resulting spreadsheet, and import it using the ExcelImporter. For this tutorial, we are happy with the out-of-the-box DMS rules, so we just pass the InformationRules into the DMS exporter, which will automatically do the conversion.
exporter = exporters.DMSExporter()
result = exporter.export_to_cdf(rules, client)
C:\Users\AndersAlbert\AppData\Local\pypoetry\Cache\virtualenvs\cognite-neat-WszCo0Uu-py3.12\Lib\site-packages\pydantic\main.py:176: MoreThanOneNonAlphanumericCharacterWarning: ('property', 'Shape__Length') self.__pydantic_validator__.validate_python(data, self_instance=self)
result
name | created | |
---|---|---|
0 | spaces | 1 |
1 | containers | 1 |
2 | views | 1 |
3 | data_models | 1 |
We see that the data model was successfully created.
Populating Data Model¶
To populate the data model in CDF, we use a loader.
from cognite.neat.graph import loaders
loaders
Loader | Description | |
---|---|---|
0 | DMSLoader | Load data from Cognite Data Fusions Data Model... |
loaders.DMSLoader
DMSLoader
Load data from Cognite Data Fusions Data Modeling Service (DMS) into Neat.
Available factory methods:
- .from_data_model_id
- .from_rules
To load the data from the graph store, we first add the rules to the store:
store.add_rules(rules)
store
Provenance Provenance is a record of changes that have occurred in the graph store.
Agent | Activity | Entity | Description | |
---|---|---|---|---|
0 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Initialize graph store as Memory |
1 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Extracted triples to graph store using AssetsE... |
2 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Added rules to graph store as InformationRules |
3 | http://purl.org/cognite/neat#agent | http://purl.org/cognite/neat#activity-b2c2ee23... | http://purl.org/cognite/neat#graph-store | Upsert prefixes to graph store |
This is necessary for the store to be ready to load data into an external system. Note that by adding the rules to the store, the prefixes have been updated to match the rules object. This is how the loader knows which triples to fetch from the store.
First, we ensure the instance space exists.
from cognite.client import data_modeling as dm
created = client.data_modeling.spaces.apply(dm.SpaceApply("sp_pump_station"))
created
value | |
---|---|
space | sp_pump_station |
is_global | False |
last_updated_time | 2024-07-09 05:31:00.944000 |
created_time | 2024-07-09 05:31:00.944000 |
We can now use the loader to populate the data model in CDF.
Note that the DMSLoader requires the DMSRules format, so we pass in the DMS rules we created above.
Alternatively, we can create the loader by passing in a data model ID.
loader = loaders.DMSLoader.from_rules(dms_rules, store, instance_space="sp_pump_station")
result = loader.load_into_cdf(client)
result
name | created | |
---|---|---|
0 | Nodes | 245.0 |
1 | Edges | NaN |
As we see from the result above, Neat has created 245 Nodes in the new data model.
Results¶
Final Remarks¶
- In this tutorial, we used the in-memory version of the Neat store. This works well for small examples, like the toy example here, but for larger asset hierarchies we likely need to use a faster triple store such as GraphDB or Oxigraph. These are also available in Neat, but require extra dependencies.
- This can be considered the first step of a full migration. At least two related problems may remain:
  - First, we might want to infer a more specific type than Asset, for example, Pump and LiftStation. This means adding information that is not explicitly set in the existing Asset Hierarchy. The type might be implicitly defined from the level in the hierarchy, or for example, the external ID of the asset. See part 2 for an example of how to add type in the migration process.
  - We might want to map the inferred model onto an existing data model. In this case the existing model would be an EnterpriseModel and the inferred model we obtained here would be a Source model.
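As an illustration of deriving a type from the level in the hierarchy, here is a minimal sketch in plain Python. It is not the approach taken in part 2; the asset identifiers, the `TYPE_BY_DEPTH` table, and the assumption that level 1 holds lift stations and level 2 holds pumps are all hypothetical.

```python
from typing import Optional


def depth_of(external_id: str, parent_of: dict[str, Optional[str]]) -> int:
    """Count the steps from an asset up to its root (assumes no cycles)."""
    depth = 0
    current = external_id
    while parent_of.get(current) is not None:
        current = parent_of[current]
        depth += 1
    return depth


# Hypothetical hierarchy: child external ID -> parent external ID.
parent_of = {
    "root": None,
    "station_A": "root",
    "pump_1": "station_A",
    "pump_2": "station_A",
}

# Assumed convention: the level in the hierarchy determines the type.
TYPE_BY_DEPTH = {0: "Asset", 1: "LiftStation", 2: "Pump"}

asset_types = {
    external_id: TYPE_BY_DEPTH.get(depth_of(external_id, parent_of), "Asset")
    for external_id in parent_of
}
print(asset_types)
```

Assets at levels without an assumed type fall back to the generic Asset type, so no asset is left unclassified.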