RDF Onboarding

RDF Onboarding¶

Prerequisite:

Access to a CDF Project.
Know how to install and setup Python.
Launch a Python notebook.

In this tutorial we will show how to onboard source data to CDF. We will:

read in Nordic44 data, containing instances of the nordic power system in the form of RDF triples.
infer the underlying data model
prepare data model to be CDF compliant
upload data model to CDF
populate data model with instances from the Nordic44 data

In [1]:

Copied!

from cognite.neat import NeatSession, get_cognite_client
from cognite.neat import NeatSession, get_cognite_client

In [2]:

Copied!

client = get_cognite_client(".env")
client = get_cognite_client(".env")

Found .env file in repository root. Loaded variables from .env file.

In [3]:

Copied!

neat = NeatSession(client, storage="oxigraph", verbose=True)
neat = NeatSession(client, storage="oxigraph", verbose=True)

Neat Engine 2.0.3 loaded.

We have already nordic44 as an example in neat and we can read it in as following:

In [4]:

Copied!

neat.read.examples.nordic44()
neat.read.examples.nordic44()

Out[4]:

No issues found

Let's inspect content of the NEAT session:

In [5]:

Copied!

neat
neat

Out[5]:

Instances

Overview:

1 named graphs
Total of 59 unique types
2713 instances in default graph

default graph:

	Type	Occurrence
0	CurrentLimit	530
1	Terminal	452
2	OperationalLimitSet	238
3	OperatingShare	207
4	VoltageLimit	184
...	...	...
58	FullModel	1

Provenance:

Initialize graph store as OxigraphStore
Extracted triples to named graph urn:x-rdflib:default using RdfFileExtractor

Neat provides you a way also to visualize instances:

In [6]:

Copied!

neat.show.instances()
neat.show.instances()

instances.html

Out[6]:

Let's now infer data model from the instances (this take with in-memory graph store ~ 20s).

The inference will produce un-validated logical data model (aka Information Rules), which we will later prepare for validation and later conversion to physical data model (aka DMS Rules)

In [7]:

Copied!

neat.infer()
neat.infer()

Out[7]:

	count
NeatIssue
ResourceRegexViolationWarning	333

Now if we inspect session you will beside instances overview also metadata about the inferred data model:

In [8]:

Copied!

neat
neat

Out[8]:

Data Model


type	Logical Data Model
intended for	Information Architect
name	Inferred Model
external_id	NeatInferredDataModel
version	v1
classes	59
properties	333

Instances

Overview:

1 named graphs
Total of 59 unique types
2713 instances in default graph

default graph:

	Type	Occurrence
0	CurrentLimit	530
1	Terminal	452
2	OperationalLimitSet	238
3	OperatingShare	207
4	VoltageLimit	184
...	...	...
58	FullModel	1

Provenance:

Initialize graph store as OxigraphStore
Extracted triples to named graph urn:x-rdflib:default using RdfFileExtractor

Since the inferred data model might contain external_ids of properties and objects which are not compliant with cdf we need to prepare it prior validation and conversion to cdf compliant data model. This can be done by running:

In [9]:

Copied!

neat.fix.data_model.cdf_compliant_external_ids()
neat.fix.data_model.cdf_compliant_external_ids()

Out[9]:

Success: NEAT(verified,logical,neat_space,NeatInferredDataModel,v1) → NEAT(verified,logical,neat_space,NeatInferredDataModel,v1)

Now let's convert the data model to be CDF complian data model (aka physical data model)

In [11]:

Copied!

neat.convert()
neat.convert()

Rules converted to dms.

Out[11]:

Success: NEAT(verified,logical,neat_space,NeatInferredDataModel,v1) → NEAT(verified,physical,neat_space,NeatInferredDataModel,v1)

In [12]:

Copied!

neat
neat

Out[12]:

Data Model


aspect	physical
intended for	DMS Architect
name	Inferred Model
space	neat_space
external_id	NeatInferredDataModel
version	v1
views	59
containers	59
properties	333

Instances

Overview:

1 named graphs
Total of 59 unique types
2713 instances in default graph

default graph:

	Type	Occurrence
0	CurrentLimit	530
1	Terminal	452
2	OperationalLimitSet	238
3	OperatingShare	207
4	VoltageLimit	184
...	...	...
58	FullModel	1

Provenance:

Initialize graph store as OxigraphStore
Extracted triples to named graph urn:x-rdflib:default using RdfFileExtractor
Added dict to urn:x-rdflib:default named graph
Upsert prefixes to the name graph {named_graph}
Added dict to urn:x-rdflib:default named graph
Upsert prefixes to the name graph {named_graph}

Now we have all the components to form knowledge graph in CDF, i.e. data model and instances. Let's upload them to CDF:

In [18]:

Copied!

neat.to.cdf.data_model()
neat.to.cdf.data_model()

Out[18]:

	name	created
0	spaces	1
1	containers	59
2	views	59
3	data_models	1

In [19]:

Copied!

neat.to.cdf.instances(space="inference_space")
neat.to.cdf.instances(space="inference_space")

Out[19]:

	name	created	changed
0	Nodes	1000.0	NaN
1	Edges	NaN	NaN
2	Nodes	737.0	263.0
3	Edges	NaN	NaN
4	Nodes	547.0	166.0
5	Edges	NaN	NaN

Results of the upload should be visible in the CDF project such as in the following screenshot:

No description has been provided for this image