From Source to CDF: Nordic44 Use Case
Prerequisites:
- Access to a CDF project.
- Know how to install and set up Python.
- Launch a Python notebook.
In this tutorial we will show how to onboard source data to CDF. We will:
- read in the Nordic44 data, which contains instances of the Nordic power system in the form of RDF triples
- infer the underlying data model
- prepare the data model to be CDF compliant
- upload the data model to CDF
- populate the data model with instances from the Nordic44 data
from cognite.neat import NeatSession, get_cognite_client
client = get_cognite_client(".env")
Found .env file in repository root. Loaded variables from .env file.
neat = NeatSession(client, verbose=True)
Nordic44 is already included as an example in NEAT, and we can read it in as follows:
neat.read.rdf.examples.nordic44
'No issues found'
Let's inspect the content of the NEAT session:
neat
Instances
Overview:
- 59 types
- 2713 instances
| | Type | Occurrence |
|---|---|---|
| 0 | CurrentLimit | 530 |
| 1 | Terminal | 452 |
| 2 | OperationalLimitSet | 238 |
| 3 | OperatingShare | 207 |
| 4 | VoltageLimit | 184 |
| ... | ... | ... |
| 58 | PowerTransferCorridor | 1 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using RdfFileExtractor
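The overview above is essentially a count of `rdf:type` statements per class. A minimal sketch of the same computation over plain Python triples (the sample data below is hypothetical, not the actual Nordic44 content):

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# Hypothetical sample triples (subject, predicate, object) mimicking Nordic44 content.
triples = [
    ("t1", RDF_TYPE, "Terminal"),
    ("t2", RDF_TYPE, "Terminal"),
    ("cl1", RDF_TYPE, "CurrentLimit"),
    ("t1", "cim:name", "T1"),
]

# Count instances per type, i.e. occurrences of rdf:type per object class.
occurrence = Counter(obj for _, pred, obj in triples if pred == RDF_TYPE)
print(occurrence.most_common())  # -> [('Terminal', 2), ('CurrentLimit', 1)]
```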
NEAT also provides a way to visualize the instances:
neat.show.instances()
Max of 100 nodes and edges are displayed, which are randomly selected.
CytoscapeWidget(cytoscape_layout={'name': 'cola'}, cytoscape_style=[{'selector': 'edge', 'style': {'width': 1,…
Let's now infer the data model from the instances (with the in-memory graph store this takes ~20 s).
The inference produces an unvalidated logical data model (aka Information Rules), which we will later prepare for validation and subsequent conversion to a physical data model (aka DMS Rules).
issues = neat.infer()
Data Model Inference <class 'cognite.neat._store._base.NeatGraphStore'> <cognite.neat._store._base.NeatGraphStore object at 0x118ec7d50> read successfully
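Conceptually, the inference step scans the instances and, for each type, collects which properties its instances use. A simplified sketch of that idea (hypothetical triples; the actual NEAT implementation also infers value types, cardinalities, and more):

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical sample triples (subject, predicate, object).
triples = [
    ("t1", RDF_TYPE, "Terminal"),
    ("t1", "cim:name", "T1"),
    ("t1", "cim:connected", "true"),
    ("cl1", RDF_TYPE, "CurrentLimit"),
    ("cl1", "cim:value", "530"),
]

# First pass: map each subject to its declared type.
subject_type = {s: o for s, p, o in triples if p == RDF_TYPE}

# Second pass: collect the set of properties observed per type.
schema: dict[str, set[str]] = defaultdict(set)
for s, p, _ in triples:
    if p != RDF_TYPE:
        schema[subject_type[s]].add(p)

print(dict(schema))
```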
If we now inspect the session, we will see, besides the instances overview, metadata about the inferred data model:
neat
Raw Data Model
| | 0 |
|---|---|
| dataModelType | enterprise |
| schema | partial |
| extension | addition |
| prefix | inference_space |
| namespace | http://purl.org/cognite/neat# |
| title | InferredDataModel |
| description | Inferred model from knowledge graph |
| version | v1 |
| created | 2024-10-28 16:55:20.096865 |
| updated | 2024-10-28 16:55:20.096870 |
| creator | NEAT |
| license | None |
| rights | None |
Instances
Overview:
- 59 types
- 2713 instances
| | Type | Occurrence |
|---|---|---|
| 0 | CurrentLimit | 530 |
| 1 | Terminal | 452 |
| 2 | OperationalLimitSet | 238 |
| 3 | OperatingShare | 207 |
| 4 | VoltageLimit | 184 |
| ... | ... | ... |
| 58 | PowerTransferCorridor | 1 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using RdfFileExtractor
Since the inferred data model might contain external IDs of properties and objects that are not CDF compliant, we need to prepare it prior to validation and conversion to a CDF-compliant data model. This can be done by running:
neat.prepare.data_model.cdf_compliant_external_ids()
neat.verify()
'No issues found'
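To illustrate the kind of cleanup `cdf_compliant_external_ids` performs: CDF external IDs are restricted to a limited character set, so special characters coming from RDF identifiers (colons, dots, etc.) must be replaced. A simplified sketch, assuming the rule "letters, digits and underscores, starting with a letter" (the actual rules NEAT enforces are more involved):

```python
import re

def to_cdf_compliant(external_id: str) -> str:
    # Replace every disallowed character with an underscore.
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", external_id)
    # Prefix if the result does not start with a letter.
    if not cleaned[:1].isalpha():
        cleaned = "prefix_" + cleaned
    return cleaned

print(to_cdf_compliant("cim:ACLineSegment.r"))  # -> cim_ACLineSegment_r
print(to_cdf_compliant("42kV-limit"))           # -> prefix_42kV_limit
```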
After running verify, we can see that we now have a verified data model in the NEAT session:
neat
Data Model
| | 0 |
|---|---|
| role | information architect |
| data_model_type | enterprise |
| schema_ | partial |
| extension | addition |
| prefix | inference_space |
| namespace | http://purl.org/cognite/neat# |
| name | InferredDataModel |
| description | Inferred model from knowledge graph |
| version | v1 |
| created | 2024-10-28 16:55:20.096865 |
| updated | 2024-10-28 16:55:20.096870 |
| creator | NEAT |
| license | None |
| rights | None |
Instances
Overview:
- 59 types
- 2713 instances
| | Type | Occurrence |
|---|---|---|
| 0 | CurrentLimit | 530 |
| 1 | Terminal | 452 |
| 2 | OperationalLimitSet | 238 |
| 3 | OperatingShare | 207 |
| 4 | VoltageLimit | 184 |
| ... | ... | ... |
| 58 | PowerTransferCorridor | 1 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using RdfFileExtractor
- Added rules to graph store as InformationRules
- Upsert prefixes to graph store
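The provenance list above is an append-only log of the operations applied to the session's graph store. A minimal sketch of such a log (the class and method names are illustrative, not NEAT internals):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenanceLog:
    """Append-only record of operations applied to a graph store."""
    entries: list[tuple[datetime, str]] = field(default_factory=list)

    def record(self, activity: str) -> None:
        # Each entry pairs a timestamp with a human-readable activity description.
        self.entries.append((datetime.now(timezone.utc), activity))

log = ProvenanceLog()
log.record("Initialize graph store as Memory")
log.record("Extracted triples to graph store using RdfFileExtractor")
log.record("Added rules to graph store as InformationRules")

for _, activity in log.entries:
    print(activity)
```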
As with the instances, we can visualize the data model:
neat.show.data_model()
CytoscapeWidget(cytoscape_layout={'name': 'cola'}, cytoscape_style=[{'selector': 'node', 'css': {'content': 'd…
Now let's convert the verified data model into a CDF-compliant data model (aka the physical data model):
neat.convert("dms")
Rules converted to dms
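Part of this conversion is mapping logical value types from the information model onto physical property types for DMS containers. A toy illustration of such a mapping (the table below is hypothetical, not NEAT's actual conversion rules):

```python
# Hypothetical mapping from logical value types to DMS container property types.
LOGICAL_TO_DMS = {
    "string": "text",
    "integer": "int64",
    "float": "float64",
    "boolean": "boolean",
    "dateTime": "timestamp",
}

def to_dms_type(logical_type: str) -> str:
    # Fall back to text for logical types with no direct physical equivalent.
    return LOGICAL_TO_DMS.get(logical_type, "text")

print(to_dms_type("float"))    # -> float64
print(to_dms_type("unknown"))  # -> text
```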
neat
Data Model
| | 0 |
|---|---|
| role | DMS Architect |
| data_model_type | enterprise |
| schema_ | partial |
| extension | addition |
| space | inference_space |
| name | InferredDataModel |
| description | None |
| external_id | inferreddatamodel |
| version | v1 |
| creator | NEAT |
| created | 2024-10-28 16:55:20.096865 |
| updated | 2024-10-28 16:55:20.096870 |
Instances
Overview:
- 59 types
- 2713 instances
| | Type | Occurrence |
|---|---|---|
| 0 | CurrentLimit | 530 |
| 1 | Terminal | 452 |
| 2 | OperationalLimitSet | 238 |
| 3 | OperatingShare | 207 |
| 4 | VoltageLimit | 184 |
| ... | ... | ... |
| 58 | PowerTransferCorridor | 1 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using RdfFileExtractor
- Added rules to graph store as InformationRules
- Upsert prefixes to graph store
Now we have all the components needed to form a knowledge graph in CDF, i.e. the data model and the instances. Let's upload them to CDF:
neat.to.cdf.data_model()
| | name | created |
|---|---|---|
| 0 | spaces | 1 |
| 1 | containers | 59 |
| 2 | views | 59 |
| 3 | data_models | 1 |
neat.to.cdf.instances(space="inference_space")
| | name | created | changed |
|---|---|---|---|
| 0 | Nodes | 1000.0 | NaN |
| 1 | Edges | NaN | NaN |
| 2 | Nodes | 737.0 | 263.0 |
| 3 | Edges | NaN | NaN |
| 4 | Nodes | 547.0 | 166.0 |
| 5 | Edges | NaN | NaN |
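The table shows that instances are written in batches: the first row reports 1000 created nodes, and the following rows report the remainder. A minimal sketch of the batching logic, assuming a fixed chunk size of 1000 (the chunk size and function are illustrative, not NEAT internals):

```python
from typing import Iterator, TypeVar

T = TypeVar("T")

def chunked(items: list[T], size: int = 1000) -> Iterator[list[T]]:
    """Yield successive fixed-size batches from a list."""
    for start in range(0, len(items), size):
        yield items[start : start + size]

# 2284 hypothetical node ids are split into batches of 1000, 1000 and 284.
nodes = [f"node-{i}" for i in range(2284)]
print([len(batch) for batch in chunked(nodes)])  # -> [1000, 1000, 284]
```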