Classic Knowledge Graph Onboarding¶
Prerequisites:
- Access to a CDF Project.
- Know how to install and set up Python.
- Know how to launch a Python notebook.
In this tutorial, we will show you how to onboard data from the classic CDF resources (assets, time series, files, sequences, events, data sets, and labels) into an extension of the Cognite Core data model.
Use Case¶
The use case in this tutorial is a wind farm named Utsira with two wind turbines, WT-01 and WT-02. Each turbine has two time series, one for power production and one for the forecast. In addition, one of them has a linked power curve sequence and a maintenance event. There is also a weather station, modeled as an asset, that is connected to the two turbines through relationships. Finally, a file with a data sheet is linked to one of the turbines.
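To make the shape of the use case concrete, here is a minimal sketch of the same resources as plain Python data. The `Asset` class, every external ID other than Utsira, WT-01, and WT-02, the choice of WT-01 as the turbine that carries the sequence, event, and file, and the weather station having no parent are all assumptions made for illustration; they are not taken from the CDF project.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    """Hypothetical stand-in for a classic CDF asset and its linked resources."""

    external_id: str
    parent: Optional[str] = None
    time_series: List[str] = field(default_factory=list)
    sequences: List[str] = field(default_factory=list)
    events: List[str] = field(default_factory=list)
    files: List[str] = field(default_factory=list)


assets = {
    "Utsira": Asset("Utsira"),
    # Assumption: WT-01 is the turbine with the power curve, event, and data sheet.
    "WT-01": Asset(
        "WT-01",
        parent="Utsira",
        time_series=["WT-01-production", "WT-01-forecast"],
        sequences=["WT-01-power-curve"],
        events=["WT-01-maintenance"],
        files=["WT-01-datasheet"],
    ),
    "WT-02": Asset(
        "WT-02",
        parent="Utsira",
        time_series=["WT-02-production", "WT-02-forecast"],
    ),
    # Assumption: the weather station is only reachable through relationships.
    "weather-station": Asset("weather-station"),
}

# The weather station is connected to both turbines through relationships.
relationships = [("weather-station", "WT-01"), ("weather-station", "WT-02")]

for asset in assets.values():
    print(asset.external_id, "parent:", asset.parent, "time series:", len(asset.time_series))
```

The counts in this sketch (four assets, four time series, one sequence, one event, one file, two relationships) match the description above.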
In the CDF UI, this hierarchy looks as shown below:
Extracting Data¶
We start by creating a new NeatSession and connecting to CDF:
from cognite.neat import NeatSession, get_cognite_client
client = get_cognite_client(".env")
Found .env file in repository root. Loaded variables from .env file.
# We have installed neat with `pip install "cognite-neat[oxi]"` so that we can use oxigraph for storage.
neat = NeatSession(client, storage="oxigraph")
Neat Engine 2.0.3 loaded.
Next, we read in the knowledge graph by pointing to the root node. Neat will then get all assets in the hierarchy, as well as all resources linked to this hierarchy.
neat.read.cdf.classic.graph("Utsira")
Extracting Asset relationships: 100%|██████████| 1/1 [00:00<00:00, 18.51it/s]
Extracting TimeSeries relationships: 100%|██████████| 1/1 [00:00<00:00, 18.92it/s]
Extracting Sequence relationships: 100%|██████████| 1/1 [00:00<00:00, 20.36it/s]
Extracting Event relationships: 100%|██████████| 1/1 [00:00<00:00, 19.10it/s]
Extracting File relationships: 100%|██████████| 1/1 [00:00<00:00, 20.02it/s]
Extracting end nodes Asset: 100%|██████████| 1/1 [00:00<00:00, 18.49it/s]
Extracting labels: 100%|██████████| 1/1 [00:00<00:00, 8.17it/s]
Extracting data sets: 100%|██████████| 1/1 [00:00<00:00, 18.50it/s]
Success: Read Classic Graph
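The idea of starting at a root node, collecting every asset below it, and then picking up the resources linked to those assets can be pictured with a small breadth-first walk. This is only a sketch of the concept on toy data, not how neat implements the extraction; `collect_hierarchy`, the `children` and `linked` dictionaries, and the resource names are hypothetical.

```python
from collections import deque


def collect_hierarchy(root, children):
    """Return the root and all of its descendants in breadth-first order."""
    seen = [root]
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


# Toy parent -> children mapping for the Utsira hierarchy.
children = {"Utsira": ["WT-01", "WT-02"]}

# Toy asset -> linked resources mapping (names are made up for illustration).
linked = {
    "WT-01": ["timeseries:WT-01-production", "timeseries:WT-01-forecast"],
    "WT-02": ["timeseries:WT-02-production", "timeseries:WT-02-forecast"],
}

hierarchy = collect_hierarchy("Utsira", children)
resources = [res for asset in hierarchy for res in linked.get(asset, [])]

print(hierarchy)  # ['Utsira', 'WT-01', 'WT-02']
print(len(resources), "linked resources")
```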
We can now inspect the knowledge graph (instances + data model) we have extracted.
neat
Verified Data Model
type | Logical Data Model |
---|---|
intended for | Information Architect |
name | Inferred Model |
external_id | ClassicDataModel |
version | v1 |
classes | 9 |
properties | 71 |
Instances
Overview:
- 9 types
- 22 instances
 | Type | Occurrence |
---|---|---|
0 | ClassicAsset | 4 |
1 | ClassicTimeSeries | 4 |
2 | ClassicLabel | 4 |
3 | ClassicDataSet | 3 |
4 | ClassicSourceSystem | 2 |
... | ... | ... |
8 | ClassicEvent | 1 |
Provenance:
- Initialize graph store as OxigraphStore
- Extracted triples to graph store using ClassicGraphExtractor
- Extracted triples to graph store using ClassicGraphExtractor
- Extracted triples to graph store using ClassicGraphExtractor
- Extracted triples to graph store using ClassicGraphExtractor
- Extracted triples to graph store using ClassicGraphExtractor
- Extracted triples to graph store using ClassicGraphExtractor
- Extracted triples to graph store using ClassicGraphExtractor
- Extracted triples to graph store using ClassicGraphExtractor
- Added rules to graph store as InformationRules
- Upsert prefixes to graph store
- Lookup relationships source and target externalId
- ConvertLiteral is a transformer that improve data typing of a literal value.
- Converts a literal value to new entity
neat.show.data_model()
http_purl.org_cognite_neat_data-model_verified_logical_neat_space_ClassicDataModel_v1.html
Converting and Mapping to the Core Data Model¶
The data model we get from reading in the knowledge graph is in the information format, i.e., it does not contain information about how it should be implemented. We convert it to DMS, which is the implementation format, and then map it onto Cognite Core. In the mapping to core, we set the prefix of all the new views to our company name.
neat.convert("dms")
Rules converted to dms
Success: VerifiedInformationModel → VerifiedDMSModel
organization_name = "DoctrinoInc"
neat.mapping.data_model.classic_to_core(organization_name)
Succeeded with warnings: VerifiedDMSModel → VerifiedDMSModel
NeatIssue | count |
---|---|
PropertyOverwritingWarning | 11 |
DefaultWarning | 9 |
Hint: Use the .inspect.issues() for more details.
We get some warnings because, for example, the value types used in the classic model are changed by the mapping to Cognite Core. Finally, we inspect the mapped model.
neat.show.data_model()
http_purl.org_cognite_neat_data-model_verified_physical_neat_space_ClassicDataModel_v1.html
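The effect of the company prefix on view names can be illustrated with a small renaming function. The pattern (replacing the `Classic` prefix with the organization name) is inferred from the type and view names that appear in the outputs of this tutorial; `to_org_view_name` is a hypothetical helper, not part of neat, and the actual mapping does more than rename views.

```python
def to_org_view_name(classic_name, org_prefix, classic_prefix="Classic"):
    """Swap the classic prefix for the organization prefix, if present."""
    if not classic_name.startswith(classic_prefix):
        return classic_name
    return org_prefix + classic_name[len(classic_prefix):]


# Type names as they appear in the inferred model earlier in this tutorial.
classic_types = [
    "ClassicAsset",
    "ClassicTimeSeries",
    "ClassicLabel",
    "ClassicDataSet",
    "ClassicSourceSystem",
    "ClassicEvent",
]

renamed = {name: to_org_view_name(name, "DoctrinoInc") for name in classic_types}
for old, new in renamed.items():
    print(old, "->", new)
```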
Publishing Data Model¶
We can now publish the data model. First, we set the data model ID (space, external ID, and version):
neat.set.data_model_id(("sp_doctrino_snapshot", "WindFarm", "v1"))
Success: VerifiedDMSModel → VerifiedDMSModel
neat.to.cdf.data_model()
You can inspect the details with the .inspect.outcome.data_model(...) method.
 | name | unchanged | created | changed | deleted |
---|---|---|---|---|---|
0 | spaces | 1 | 0 | 0 | 0 |
1 | containers | 1 | 8 | 1 | 0 |
2 | views | 2 | 0 | 8 | 0 |
3 | data_models | 0 | 1 | 0 | 1 |
4 | nodes | 0 | 0 | 0 | 0 |
Populate New Model¶
neat.to.cdf.instances()
You can inspect the details with the .inspect.outcome.instances(...) method.
 | name | created | changed | unchanged |
---|---|---|---|---|
0 | DoctrinoIncDataSet | 3 | 0 | 0 |
1 | DoctrinoIncLabel | 4 | 0 | 0 |
2 | DoctrinoIncRelationship | 2 | 0 | 0 |
3 | DoctrinoIncSequence | 1 | 0 | 0 |
4 | DoctrinoIncSourceSystem | 2 | 0 | 0 |
5 | DoctrinoIncAsset | 3 | 3 | 1 |
6 | | 3 | 0 | 0 |
7 | DoctrinoIncFile | 1 | 0 | 0 |
8 | DoctrinoIncTimeSeries | 4 | 0 | 0 |
9 | DoctrinoIncEvent | 1 | 0 | 0 |
(Optional) Create a Data Product Model¶
The new model will be very large, as it includes 33 views from CogniteCore. We will now make a read-only model, called a Data Product, that is smaller and only uses the original views and properties we identified.
neat.prepare.data_model.to_data_product(("sp_doctrino_readonly", "WindFarmReadOnly", "v1"))
And then publish this one:
neat.to.cdf.data_model(existing="force")
You can inspect the details with the .inspect.outcome.data_model(...) method.
 | name | unchanged | changed | created | deleted |
---|---|---|---|---|---|
0 | spaces | 1 | 0 | 0 | 0 |
1 | containers | 0 | 0 | 0 | 0 |
2 | views | 2 | 8 | 0 | 0 |
3 | data_models | 0 | 0 | 1 | 1 |
4 | nodes | 0 | 0 | 0 | 0 |
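Conceptually, the data product is a subset of the full model: it keeps the views we defined ourselves and leaves out the Cognite Core views we never used. The sketch below shows that selection on a toy list of view names. The `select_data_product_views` helper and the prefix-based filter are assumptions for illustration; neat's own selection logic may differ, for example in which core views it retains.

```python
def select_data_product_views(all_views, own_prefix):
    """Keep only the views whose name starts with the organization prefix."""
    return [view for view in all_views if view.startswith(own_prefix)]


# Toy list mixing a few Cognite Core views with organization-specific views.
all_views = [
    "CogniteAsset",
    "CogniteTimeSeries",
    "CogniteEquipment",
    "DoctrinoIncAsset",
    "DoctrinoIncTimeSeries",
    "DoctrinoIncEvent",
]

data_product_views = select_data_product_views(all_views, "DoctrinoInc")
print(data_product_views)
```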