CSV Onboarding¶
Prerequisites:
- Installed Neat, see Installation
- Launched a notebook environment
- Familiar with the NeatSession object, see introduction
- Access to NeatEngine
In this tutorial, we will load data from a CSV file, infer a data model from the data, and push the model together with the data to CDF.
Reading Metadata¶
We will start by instantiating a NeatSession and reading the data from a URL.
from cognite.neat import NeatSession, get_cognite_client
neat = NeatSession(get_cognite_client(".env"))
Found .env file in repository root. Loaded variables from .env file. Neat Engine 2.0.3 loaded.
A snippet of the data we are reading is shown below:
ELC_STATUS_ID,RES_ID,SOURCE_DB,SOURCE_TABLE,WMT_AREA_ID,WMT_CATEGORY_ID,WMT_CONTRACTOR_ID,WMT_FUNC_CODE_ID,WMT_LOCATION_ID,WMT_PO_ID,WMT_SAFETYCRITICALELEMENT_ID,WMT_SYSTEM_ID,WMT_TAG_CREATED_DATE,WMT_TAG_CRITICALLINE,WMT_TAG_DESC,WMT_TAG_GLOBALID,WMT_TAG_HISTORYREQUIRED,WMT_TAG_ID,WMT_TAG_ID_ANCESTOR,WMT_TAG_ISACTIVE,WMT_TAG_ISOWNEDBYPROJECT,WMT_TAG_LOOP,WMT_TAG_MAINID,WMT_TAG_NAME,WMT_TAG_UPDATED_BY,WMT_TAG_UPDATED_DATE,WMT_TAG_STATUSCHGDATE,WMT_TAG_COMMENT,WMT_TAG_SUFFIX,WMT_SYSTEM_ACTIVE,WMT_SYSTEM_CODE,WMT_SYSTEM_DESC,WMT_SYSTEM_NAME,WMT_LOCATION_ACTIVE,WMT_LOCATION_CODE,WMT_LOCATION_EXTENDACTIVEWOP,WMT_LOCATION_EXTERNALOWNERSHIP,WMT_LOCATION_MAITIS,WMT_LOCATION_NAME,WMT_LOCATION_NOCOPIESDEFAULTIC,WMT_LOCATION_NOCOPIESWOPERMIT,WMT_LOCATION_OPERATIONHOURS,WMT_LOCATION_PROGVALUE,WMT_LOCATION_SIMULATETIMEFRAME,WMT_LOCATION_SJAMAXNOOFTASKS,WMT_LOCATION_USEPLOTALTITUDE,WMT_LOCATION_WORKSTART,latestUpdateTimeSource
1211,525283,workmate,wmate_dba.wmt_tag,1600,1116,1686,4564,1004,8309,1060,4440,26/06/2009 15:36,N,VRD - PH 1STSTGGEAR THRUST BRG OUT,1000000000681024,Y,346434,345637,1,0,96116,681760,23-TE-96116-04,8137,11/07/2014 09:25,,,,,,,,,,,,,,,,,,,,,,
1211,532924,workmate,wmate_dba.wmt_tag,1600,1116,1686,4564,1004,8309,1060,4440,26/06/2009 15:36,N,VRD - PH 1STSTG COMP SEAL GAS HTR,1000000000682252,Y,346452,346633,1,0,96148,681760,23-TE-96148,8137,11/07/2014 09:25,,,,,,,,,,,,,,,,,,,,,,
1211,446683,workmate,wmate_dba.wmt_tag,1600,1116,1686,4627,1004,8309,1060,4440,26/06/2009 15:36,N,VRD - PH 1STSTGGEAR 1 JOURNBRG DE,1000000000715794,Y,346995,346935,1,0,96117,681760,23-YT-96117-01,9802,09/12/2013 12:53,,,,,,,,,,,,,,,,,,,,,,
1211,,workmate,wmate_dba.wmt_tag,1600,1152,1686,11275,1004,,,4440,13/12/2012 14:13,N,SOFT TAG VRD - PH 1STSTG PRIM SEAL LEAK DE,1000000000250739,Y,682956,345868,1,0,,681760,23-FI-96151,1001,09/10/2015 11:56,06/10/2014 07:45,,,,,,,,,,,,,,,,,,,,,
To read a CSV file, we need to tell neat what type of data the source contains, as well as which column is the identifier.
We know that this data contains assets and that the column WMT_TAG_GLOBALID is the unique identifier of these assets.
url = "https://apps-cdn.cogniteapp.com/toolkit/publicdata/assets.Table.csv"
neat.read.csv(url, type="Asset", primary_key="WMT_TAG_GLOBALID")
neat
Instances
Overview:
- 1 types
- 1103 instances
| | Type | Occurrence |
---|---|---|
0 | Asset | 1103 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using CSVExtractor
Studying the output above, we see that we successfully read 1103 assets into the NeatSession.
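The read above relies on WMT_TAG_GLOBALID being present and unique for every row. As a minimal sketch of how that assumption could be checked before loading, the snippet below uses only the Python standard library on three rows taken from the data shown earlier, reduced to two columns. The helper name `check_primary_key` is hypothetical and not part of neat.

```python
import csv
import io
from collections import Counter

# Three rows from the snippet above, reduced to two columns for brevity.
SAMPLE = """WMT_TAG_GLOBALID,WMT_TAG_NAME
1000000000681024,23-TE-96116-04
1000000000682252,23-TE-96148
1000000000715794,23-YT-96117-01
"""


def check_primary_key(rows, column):
    """Return (missing, duplicates) for a candidate identifier column.

    Hypothetical helper for illustration; neat does not expose this function.
    """
    values = [row.get(column) or "" for row in rows]
    # Count cells that are empty or whitespace only.
    missing = sum(1 for value in values if not value.strip())
    # Count how often each non-empty value occurs.
    counts = Counter(value for value in values if value.strip())
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return missing, duplicates


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
missing, duplicates = check_primary_key(rows, "WMT_TAG_GLOBALID")
print(f"missing={missing}, duplicates={duplicates}")
```

For the full file, the same function could be applied to rows read from a local copy of the CSV. Any missing or duplicated value would suggest choosing a different column as the identifier.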
Infer Data Model¶
We can infer a data model from data in the NeatSession by calling .infer().
neat.infer()
Success: Imported UnverifiedInformationModel
neat
Unverified Data Model
type | Logical Data Model |
---|---|
intended for | Information Architect |
name | Inferred Model |
external_id | NeatInferredDataModel |
space | neat_space |
version | v1 |
classes | 1 |
properties | 29 |
Instances
Overview:
- 1 types
- 1103 instances
| | Type | Occurrence |
---|---|---|
0 | Asset | 1103 |
Provenance:
- Initialize graph store as Memory
- Extracted triples to graph store using CSVExtractor
This gives us an unverified data model, which we can then verify.
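To build intuition for what inferring a model from data involves, here is a rough sketch of guessing one value type per column from its cell values. This is not neat's inference logic, which is not described in this tutorial, and neat's result for a given column may differ. The function name `infer_value_type` and the date format are assumptions made for illustration, and the sample values come from the CSV snippet shown earlier.

```python
from datetime import datetime

# Assumed date layout, based on values such as "26/06/2009 15:36" in the snippet.
DATE_FORMAT = "%d/%m/%Y %H:%M"


def infer_value_type(values):
    """Guess a single value type for a column, ignoring empty cells.

    Illustrative only; this is not how neat infers types.
    """
    present = [v for v in values if v != ""]
    if not present:
        return "string"  # nothing to go on, fall back to the most permissive type

    def all_parse(parser):
        # True only if every non-empty value is accepted by the parser.
        for value in present:
            try:
                parser(value)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "long"
    if all_parse(lambda v: datetime.strptime(v, DATE_FORMAT)):
        return "dateTime"
    return "string"


columns = {
    "ELC_STATUS_ID": ["1211", "1211", "1211", "1211"],
    "RES_ID": ["525283", "532924", "446683", ""],
    "WMT_TAG_CREATED_DATE": ["26/06/2009 15:36", "13/12/2012 14:13"],
    "WMT_TAG_NAME": ["23-TE-96116-04", "23-FI-96151"],
}
for name, values in columns.items():
    print(name, infer_value_type(values))
```

The sketch tries the strictest type first and falls back to string, so a single non-numeric value is enough to make a whole column a string.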
Verify Data Model¶
neat.verify()
Success: UnverifiedInformationModel → VerifiedInformationModel
neat.inspect.properties
| | neatId | class_ | property_ | value_type | max_count | transformation |
---|---|---|---|---|---|---|
0 | http://purl.org/cognite/neat/neatId_27065821_b... | neat_space:Asset | WMT_TAG_MAINID | long | 1 | prefix_1:Asset(prefix_1:WMT_TAG_MAINID) |
1 | http://purl.org/cognite/neat/neatId_9cf70305_2... | neat_space:Asset | ELC_STATUS_ID | long | 1 | prefix_1:Asset(prefix_1:ELC_STATUS_ID) |
2 | http://purl.org/cognite/neat/neatId_bcad3c67_7... | neat_space:Asset | SOURCE_TABLE | string | 1 | prefix_1:Asset(prefix_1:SOURCE_TABLE) |
3 | http://purl.org/cognite/neat/neatId_92d7751d_0... | neat_space:Asset | SOURCE_DB | string | 1 | prefix_1:Asset(prefix_1:SOURCE_DB) |
4 | http://purl.org/cognite/neat/neatId_e9b68652_9... | neat_space:Asset | WMT_TAG_CRITICALLINE | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_CRITICALLINE) |
5 | http://purl.org/cognite/neat/neatId_e6a10869_4... | neat_space:Asset | WMT_TAG_HISTORYREQUIRED | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_HISTORYREQUIRED) |
6 | http://purl.org/cognite/neat/neatId_c25750d7_7... | neat_space:Asset | WMT_TAG_ISACTIVE | long | 1 | prefix_1:Asset(prefix_1:WMT_TAG_ISACTIVE) |
7 | http://purl.org/cognite/neat/neatId_2e47d057_b... | neat_space:Asset | WMT_TAG_CREATED_DATE | dateTime | 1 | prefix_1:Asset(prefix_1:WMT_TAG_CREATED_DATE) |
8 | http://purl.org/cognite/neat/neatId_dd1695fa_6... | neat_space:Asset | WMT_SYSTEM_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_SYSTEM_ID) |
9 | http://purl.org/cognite/neat/neatId_45fa22f9_2... | neat_space:Asset | WMT_TAG_STATUSCHGDATE | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_STATUSCHGDATE) |
10 | http://purl.org/cognite/neat/neatId_d27382d6_b... | neat_space:Asset | WMT_PO_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_PO_ID) |
11 | http://purl.org/cognite/neat/neatId_96cbf651_9... | neat_space:Asset | WMT_TAG_UPDATED_BY | long | 1 | prefix_1:Asset(prefix_1:WMT_TAG_UPDATED_BY) |
12 | http://purl.org/cognite/neat/neatId_a541a9e9_f... | neat_space:Asset | WMT_TAG_LOOP | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_LOOP) |
13 | http://purl.org/cognite/neat/neatId_f647a0d1_0... | neat_space:Asset | WMT_TAG_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_TAG_ID) |
14 | http://purl.org/cognite/neat/neatId_9d6cb5b4_1... | neat_space:Asset | WMT_CATEGORY_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_CATEGORY_ID) |
15 | http://purl.org/cognite/neat/neatId_d02d0835_d... | neat_space:Asset | WMT_SAFETYCRITICALELEMENT_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_SAFETYCRITICALELEM... |
16 | http://purl.org/cognite/neat/neatId_a8bcb23f_4... | neat_space:Asset | WMT_CONTRACTOR_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_CONTRACTOR_ID) |
17 | http://purl.org/cognite/neat/neatId_1da2db34_b... | neat_space:Asset | WMT_TAG_GLOBALID | long | 1 | prefix_1:Asset(prefix_1:WMT_TAG_GLOBALID) |
18 | http://purl.org/cognite/neat/neatId_3344b741_e... | neat_space:Asset | WMT_AREA_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_AREA_ID) |
19 | http://purl.org/cognite/neat/neatId_a1e184ec_7... | neat_space:Asset | WMT_TAG_ISOWNEDBYPROJECT | long | 1 | prefix_1:Asset(prefix_1:WMT_TAG_ISOWNEDBYPROJECT) |
20 | http://purl.org/cognite/neat/neatId_dcab2921_6... | neat_space:Asset | WMT_TAG_ID_ANCESTOR | long | 1 | prefix_1:Asset(prefix_1:WMT_TAG_ID_ANCESTOR) |
21 | http://purl.org/cognite/neat/neatId_610cd145_d... | neat_space:Asset | WMT_TAG_DESC | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_DESC) |
22 | http://purl.org/cognite/neat/neatId_658b570c_4... | neat_space:Asset | WMT_LOCATION_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_LOCATION_ID) |
23 | http://purl.org/cognite/neat/neatId_972e2870_3... | neat_space:Asset | WMT_FUNC_CODE_ID | long | 1 | prefix_1:Asset(prefix_1:WMT_FUNC_CODE_ID) |
24 | http://purl.org/cognite/neat/neatId_89009351_1... | neat_space:Asset | WMT_TAG_UPDATED_DATE | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_UPDATED_DATE) |
25 | http://purl.org/cognite/neat/neatId_1d5f5f50_a... | neat_space:Asset | WMT_TAG_COMMENT | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_COMMENT) |
26 | http://purl.org/cognite/neat/neatId_d5e9f549_2... | neat_space:Asset | RES_ID | long | 1 | prefix_1:Asset(prefix_1:RES_ID) |
27 | http://purl.org/cognite/neat/neatId_105884f2_5... | neat_space:Asset | WMT_TAG_NAME | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_NAME) |
28 | http://purl.org/cognite/neat/neatId_0840a48c_2... | neat_space:Asset | WMT_TAG_SUFFIX | string | 1 | prefix_1:Asset(prefix_1:WMT_TAG_SUFFIX) |
After inspecting the properties, we notice that we have a Logical Data Model. A logical data model cannot be written to CDF directly. To publish it, we first convert it to the dms format, which is what CDF expects for data models.
Convert Data Model¶
neat.convert("dms")
Rules converted to dms
Success: VerifiedInformationModel → VerifiedDMSModel
neat.inspect.properties
| | neatId | view | view_property | value_type | nullable | is_list | container | container_property | logical |
---|---|---|---|---|---|---|---|---|---|
0 | http://purl.org/cognite/neat/neatId_d3102f55_b... | neat_space:Asset(version=v1) | WMT_TAG_MAINID | int64 | True | False | neat_space:Asset | WMT_TAG_MAINID | http://purl.org/cognite/neat/neatId_27065821_b... |
1 | http://purl.org/cognite/neat/neatId_cdac5ee0_7... | neat_space:Asset(version=v1) | ELC_STATUS_ID | int64 | True | False | neat_space:Asset | ELC_STATUS_ID | http://purl.org/cognite/neat/neatId_9cf70305_2... |
2 | http://purl.org/cognite/neat/neatId_9116d349_1... | neat_space:Asset(version=v1) | SOURCE_TABLE | text | True | False | neat_space:Asset | SOURCE_TABLE | http://purl.org/cognite/neat/neatId_bcad3c67_7... |
3 | http://purl.org/cognite/neat/neatId_abaf3b68_d... | neat_space:Asset(version=v1) | SOURCE_DB | text | True | False | neat_space:Asset | SOURCE_DB | http://purl.org/cognite/neat/neatId_92d7751d_0... |
4 | http://purl.org/cognite/neat/neatId_a6fb18aa_3... | neat_space:Asset(version=v1) | WMT_TAG_CRITICALLINE | text | True | False | neat_space:Asset | WMT_TAG_CRITICALLINE | http://purl.org/cognite/neat/neatId_e9b68652_9... |
5 | http://purl.org/cognite/neat/neatId_f54611fc_3... | neat_space:Asset(version=v1) | WMT_TAG_HISTORYREQUIRED | text | True | False | neat_space:Asset | WMT_TAG_HISTORYREQUIRED | http://purl.org/cognite/neat/neatId_e6a10869_4... |
6 | http://purl.org/cognite/neat/neatId_e5d9fadb_1... | neat_space:Asset(version=v1) | WMT_TAG_ISACTIVE | int64 | True | False | neat_space:Asset | WMT_TAG_ISACTIVE | http://purl.org/cognite/neat/neatId_c25750d7_7... |
7 | http://purl.org/cognite/neat/neatId_85e8c9fe_4... | neat_space:Asset(version=v1) | WMT_TAG_CREATED_DATE | timestamp | True | False | neat_space:Asset | WMT_TAG_CREATED_DATE | http://purl.org/cognite/neat/neatId_2e47d057_b... |
8 | http://purl.org/cognite/neat/neatId_d7b64011_9... | neat_space:Asset(version=v1) | WMT_SYSTEM_ID | int64 | True | False | neat_space:Asset | WMT_SYSTEM_ID | http://purl.org/cognite/neat/neatId_dd1695fa_6... |
9 | http://purl.org/cognite/neat/neatId_9692d857_a... | neat_space:Asset(version=v1) | WMT_TAG_STATUSCHGDATE | text | True | False | neat_space:Asset | WMT_TAG_STATUSCHGDATE | http://purl.org/cognite/neat/neatId_45fa22f9_2... |
10 | http://purl.org/cognite/neat/neatId_c27133b7_5... | neat_space:Asset(version=v1) | WMT_PO_ID | int64 | True | False | neat_space:Asset | WMT_PO_ID | http://purl.org/cognite/neat/neatId_d27382d6_b... |
11 | http://purl.org/cognite/neat/neatId_1967e41e_a... | neat_space:Asset(version=v1) | WMT_TAG_UPDATED_BY | int64 | True | False | neat_space:Asset | WMT_TAG_UPDATED_BY | http://purl.org/cognite/neat/neatId_96cbf651_9... |
12 | http://purl.org/cognite/neat/neatId_0132fce3_b... | neat_space:Asset(version=v1) | WMT_TAG_LOOP | text | True | False | neat_space:Asset | WMT_TAG_LOOP | http://purl.org/cognite/neat/neatId_a541a9e9_f... |
13 | http://purl.org/cognite/neat/neatId_e6e1a92c_f... | neat_space:Asset(version=v1) | WMT_TAG_ID | int64 | True | False | neat_space:Asset | WMT_TAG_ID | http://purl.org/cognite/neat/neatId_f647a0d1_0... |
14 | http://purl.org/cognite/neat/neatId_033b8135_e... | neat_space:Asset(version=v1) | WMT_CATEGORY_ID | int64 | True | False | neat_space:Asset | WMT_CATEGORY_ID | http://purl.org/cognite/neat/neatId_9d6cb5b4_1... |
15 | http://purl.org/cognite/neat/neatId_35702224_b... | neat_space:Asset(version=v1) | WMT_SAFETYCRITICALELEMENT_ID | int64 | True | False | neat_space:Asset | WMT_SAFETYCRITICALELEMENT_ID | http://purl.org/cognite/neat/neatId_d02d0835_d... |
16 | http://purl.org/cognite/neat/neatId_692826a6_d... | neat_space:Asset(version=v1) | WMT_CONTRACTOR_ID | int64 | True | False | neat_space:Asset | WMT_CONTRACTOR_ID | http://purl.org/cognite/neat/neatId_a8bcb23f_4... |
17 | http://purl.org/cognite/neat/neatId_0883d447_b... | neat_space:Asset(version=v1) | WMT_TAG_GLOBALID | int64 | True | False | neat_space:Asset | WMT_TAG_GLOBALID | http://purl.org/cognite/neat/neatId_1da2db34_b... |
18 | http://purl.org/cognite/neat/neatId_61bacd95_8... | neat_space:Asset(version=v1) | WMT_AREA_ID | int64 | True | False | neat_space:Asset | WMT_AREA_ID | http://purl.org/cognite/neat/neatId_3344b741_e... |
19 | http://purl.org/cognite/neat/neatId_8fcb8297_6... | neat_space:Asset(version=v1) | WMT_TAG_ISOWNEDBYPROJECT | int64 | True | False | neat_space:Asset | WMT_TAG_ISOWNEDBYPROJECT | http://purl.org/cognite/neat/neatId_a1e184ec_7... |
20 | http://purl.org/cognite/neat/neatId_b2ecf8f6_2... | neat_space:Asset(version=v1) | WMT_TAG_ID_ANCESTOR | int64 | True | False | neat_space:Asset | WMT_TAG_ID_ANCESTOR | http://purl.org/cognite/neat/neatId_dcab2921_6... |
21 | http://purl.org/cognite/neat/neatId_f6a963df_0... | neat_space:Asset(version=v1) | WMT_TAG_DESC | text | True | False | neat_space:Asset | WMT_TAG_DESC | http://purl.org/cognite/neat/neatId_610cd145_d... |
22 | http://purl.org/cognite/neat/neatId_d718aa8b_b... | neat_space:Asset(version=v1) | WMT_LOCATION_ID | int64 | True | False | neat_space:Asset | WMT_LOCATION_ID | http://purl.org/cognite/neat/neatId_658b570c_4... |
23 | http://purl.org/cognite/neat/neatId_c402f603_d... | neat_space:Asset(version=v1) | WMT_FUNC_CODE_ID | int64 | True | False | neat_space:Asset | WMT_FUNC_CODE_ID | http://purl.org/cognite/neat/neatId_972e2870_3... |
24 | http://purl.org/cognite/neat/neatId_2919fd0b_3... | neat_space:Asset(version=v1) | WMT_TAG_UPDATED_DATE | text | True | False | neat_space:Asset | WMT_TAG_UPDATED_DATE | http://purl.org/cognite/neat/neatId_89009351_1... |
25 | http://purl.org/cognite/neat/neatId_7fc7ff68_2... | neat_space:Asset(version=v1) | WMT_TAG_COMMENT | text | True | False | neat_space:Asset | WMT_TAG_COMMENT | http://purl.org/cognite/neat/neatId_1d5f5f50_a... |
26 | http://purl.org/cognite/neat/neatId_1f99e5ba_9... | neat_space:Asset(version=v1) | RES_ID | int64 | True | False | neat_space:Asset | RES_ID | http://purl.org/cognite/neat/neatId_d5e9f549_2... |
27 | http://purl.org/cognite/neat/neatId_617d5024_6... | neat_space:Asset(version=v1) | WMT_TAG_NAME | text | True | False | neat_space:Asset | WMT_TAG_NAME | http://purl.org/cognite/neat/neatId_105884f2_5... |
28 | http://purl.org/cognite/neat/neatId_2b739535_2... | neat_space:Asset(version=v1) | WMT_TAG_SUFFIX | text | True | False | neat_space:Asset | WMT_TAG_SUFFIX | http://purl.org/cognite/neat/neatId_0840a48c_2... |
Now we see that we have information about how the data model is implemented.
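Comparing this table with the property table of the logical model, the value types have been renamed during conversion. The sketch below records only the three pairs that appear in this tutorial's output. It is not neat's full mapping, and the names `OBSERVED_LOGICAL_TO_DMS` and `to_dms_type` are hypothetical.

```python
# Value-type pairs observed in the two property tables of this tutorial only.
OBSERVED_LOGICAL_TO_DMS = {
    "long": "int64",
    "string": "text",
    "dateTime": "timestamp",
}


def to_dms_type(logical_type):
    """Look up the DMS type seen for a logical type; fail loudly if not observed."""
    try:
        return OBSERVED_LOGICAL_TO_DMS[logical_type]
    except KeyError:
        raise ValueError(
            f"No DMS type observed in this tutorial for {logical_type!r}"
        ) from None


for logical in ("long", "string", "dateTime"):
    print(f"{logical} -> {to_dms_type(logical)}")
```

Raising an error for unknown types, rather than guessing, keeps the sketch honest about covering only what the output above shows.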
We can also show the steps we have taken so far, called the provenance of the data model.
neat.show.data_model.provenance()
data_model_provenance_bd5a993c.html
So far the model has the default space and identifier (neat_space and NeatInferredDataModel), so we set them to something unique before publishing.
Publish Data Model¶
neat.set.data_model_id(("sp_doctrino", "DoctrinoAssetModel", "v1"))
Success: VerifiedDMSModel → VerifiedDMSModel
Now we are ready to publish this to CDF.
neat.to.cdf.data_model()
You can inspect the details with the .inspect.outcome.data_model(...) method.
| | name | unchanged |
---|---|---|
0 | spaces | 1 |
1 | containers | 1 |
2 | views | 1 |
3 | data_models | 1 |
4 | nodes | 0 |
Populate Data Model¶
Neat keeps track of the data, so we can immediately populate this data model with the original data.
neat.to.cdf.instances()
INFO | 2024-12-19 15:14:37,403 | Staring DMSLoader and will process 1 views.
INFO | 2024-12-19 15:14:37,412 | Starting ViewId(space='sp_doctrino', external_id='Asset', version='v1') 1/1.
INFO | 2024-12-19 15:14:42,496 | Finished ViewId(space='sp_doctrino', external_id='Asset', version='v1').
You can inspect the details with the .inspect.outcome.instances(...) method.
| | name | created |
---|---|---|
0 | Asset | 1,103 |
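
As a recap, the steps of this tutorial can be collected into one function. This is a sketch that uses only the calls shown above, in the same order. The wrapper `onboard_csv` and its parameter names are hypothetical, and it expects an already created NeatSession to be passed in.

```python
def onboard_csv(neat, url, type_, primary_key, model_id):
    """Run the tutorial's steps in order on an existing NeatSession.

    Hypothetical convenience wrapper; it only chains calls shown in this tutorial.
    """
    neat.read.csv(url, type=type_, primary_key=primary_key)  # load instances
    neat.infer()                      # unverified logical data model
    neat.verify()                     # verified logical data model
    neat.convert("dms")               # DMS model, the format CDF expects
    neat.set.data_model_id(model_id)  # replace default space and identifier
    neat.to.cdf.data_model()          # publish the data model
    neat.to.cdf.instances()           # populate it with the data read earlier
    return neat


# Example call, commented out because it needs Neat and CDF credentials:
# from cognite.neat import NeatSession, get_cognite_client
# neat = NeatSession(get_cognite_client(".env"))
# onboard_csv(
#     neat,
#     "https://apps-cdn.cogniteapp.com/toolkit/publicdata/assets.Table.csv",
#     "Asset",
#     "WMT_TAG_GLOBALID",
#     ("sp_doctrino", "DoctrinoAssetModel", "v1"),
# )
```

Running the steps one cell at a time, as in the tutorial, is still preferable when exploring new data, since each step prints output worth inspecting before moving on.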