Introduction to NEAT v1
Prerequisites:
- NEAT installed (see Installation)
- A notebook environment launched
This tutorial introduces NEAT v1, focusing on the readers and writers for the CDF data model, also known as the physical data model.
Instantiating NeatSession
NeatSession is the only public interface to NEAT features and functionality. It provides safe and convenient access to NEAT capabilities, guiding users and protecting them from mistakes.
We start by importing NeatSession, instantiating it with a CogniteClient, and configuring it with the desired configuration profile (also known as a governance profile).
More information about NeatSession can be found in the reference documentation. Details about the configuration can be found in the configuration documentation.
from cognite.neat import NeatSession, get_cognite_client
You are welcome to instantiate CogniteClient by other means, e.g. by using the Cognite SDK directly. get_cognite_client is a convenience helper provided by NEAT to get you started quickly. If you have a .env file with your credentials from legacy NEAT, it will be picked up automatically. If you do not have a .env file, the helper will guide you through setting up authentication.
client = get_cognite_client(".env")
Found .env file in repository root. Loaded variables from .env file.
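As an alternative to the helper, a CogniteClient can be built with the Cognite SDK directly. The following is a minimal sketch, not the way NEAT itself does it: the environment variable names (CDF_CLUSTER, CDF_PROJECT, IDP_TOKEN_URL, IDP_CLIENT_ID, IDP_CLIENT_SECRET), the client name, and the scope format are assumptions and should be adapted to your identity provider.

```python
import os


def client_settings(env=None):
    """Collect connection settings from environment variables.

    All variable names used here are hypothetical; use the ones your setup defines.
    """
    env = os.environ if env is None else env
    cluster = env["CDF_CLUSTER"]
    return {
        "project": env["CDF_PROJECT"],
        "base_url": f"https://{cluster}.cognitedata.com",
        "token_url": env["IDP_TOKEN_URL"],
        "client_id": env["IDP_CLIENT_ID"],
        "client_secret": env["IDP_CLIENT_SECRET"],
        # Assumed scope format; your identity provider may require a different one.
        "scopes": [f"https://{cluster}.cognitedata.com/.default"],
    }


def make_client(settings):
    """Build a CogniteClient using the OAuth client-credentials flow."""
    # Imported here so that client_settings works without the SDK installed.
    from cognite.client import ClientConfig, CogniteClient
    from cognite.client.credentials import OAuthClientCredentials

    credentials = OAuthClientCredentials(
        token_url=settings["token_url"],
        client_id=settings["client_id"],
        client_secret=settings["client_secret"],
        scopes=settings["scopes"],
    )
    config = ClientConfig(
        client_name="neat-tutorial",  # hypothetical name, used for request tracing
        project=settings["project"],
        base_url=settings["base_url"],
        credentials=credentials,
    )
    return CogniteClient(config)
```

The resulting client could then be passed to NeatSession in place of the one returned by get_cognite_client, e.g. client = make_client(client_settings()).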
Let's start a NeatSession using the legacy-additive configuration profile, which configures NEAT to use:
- additive mode for data modeling, and
- the legacy set of validators (see the printout after NeatSession initialization for details on which validators are excluded).
We will later change the profile to deep-additive to show the impact of enabling all validators in the library.
neat = NeatSession(client=client, config="legacy-additive")
Neat session started for CDF project: 'get-power-grid' (Organization: 'cog-get-power') Profile: legacy-additive Modeling Mode: additive Validation: Excluded Rules: NEAT-DMS-AI-READINESS-*, NEAT-DMS-CONNECTIONS-002, NEAT-DMS-CONNECTIONS-REVERSE-007, NEAT-DMS-CONNECTIONS-REVERSE-008, NEAT-DMS-CONSISTENCY-001
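The excluded rules in the printout above mix exact validator codes with a wildcard entry (NEAT-DMS-AI-READINESS-*). As a rough illustration of how such a list could be interpreted, the sketch below uses shell-style matching from the Python standard library. How NEAT actually matches these patterns is not stated in this tutorial, so treat the matching logic as an assumption.

```python
from fnmatch import fnmatchcase

# Copied from the session printout for the legacy-additive profile.
EXCLUDED_RULES = [
    "NEAT-DMS-AI-READINESS-*",
    "NEAT-DMS-CONNECTIONS-002",
    "NEAT-DMS-CONNECTIONS-REVERSE-007",
    "NEAT-DMS-CONNECTIONS-REVERSE-008",
    "NEAT-DMS-CONSISTENCY-001",
]


def is_excluded(code, patterns=EXCLUDED_RULES):
    """Return True if a validator code matches any exclusion pattern.

    Assumes shell-style wildcards with case-sensitive matching.
    """
    return any(fnmatchcase(code, pattern) for pattern in patterns)
```

Under this assumption, is_excluded("NEAT-DMS-AI-READINESS-003") is true because of the wildcard entry, while a code that is not listed would still be validated.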
Reading data models
We will start by reading a data model from Cognite Data Fusion and exploring the insights provided by the data model validators that run during the read operation. We begin by reading the Core Data Model into the NeatSession.
neat.physical_data_model.read.cdf("cdf_cdm", "CogniteCore", "v1")
Read Physical Data Model - cdf ✅
If we now switch to the deep-additive profile, more validators are enabled, and these additional validators produce a large number of findings.
neat = NeatSession(client=client, config="deep-additive")
Neat session started for CDF project: 'get-power-grid' (Organization: 'cog-get-power') Profile: deep-additive Modeling Mode: additive Validation: All validators enabled
neat.physical_data_model.read.cdf("cdf_cdm", "CogniteCore", "v1")
Read Physical Data Model - cdf ✅ | Insights: 340 (of which 0 errors) 📋 For details on issues run .issues
To explore the insights, we call neat.issues, which presents a navigable web UI covering all insights from the validators that ran during the read operation.
All insights are grouped by validator code, and we can expand each validator to see its individual insights (also called issues). More details about each validator can be found by clicking on the group code (e.g., NEAT-DMS-AI-READINESS-003).
neat.issues
Session Issues
Reading a data model from other sources, such as:
- Excel file
- JSON file
- YAML file
is done in a similar manner, by using the appropriate reader from the neat.physical_data_model.read module. For more details, please refer to the physical data model reading documentation.
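To illustrate the idea of picking a reader per source format, here is a small sketch that selects a reader by file suffix. Only the cdf reader appears in this tutorial; the reader names excel, json, and yaml, and the convention of passing the file path as the single argument, are assumptions that should be checked against the reading documentation.

```python
from pathlib import Path

# Assumed mapping from file suffix to reader name on neat.physical_data_model.read.
# These reader names are not confirmed by this tutorial.
READER_BY_SUFFIX = {
    ".xlsx": "excel",
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
}


def read_data_model(neat, path):
    """Pick a reader based on the file suffix and call it with the path."""
    suffix = Path(path).suffix.lower()
    try:
        reader_name = READER_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"No reader assumed for {suffix!r} files") from None
    reader = getattr(neat.physical_data_model.read, reader_name)
    # Assumption: the reader accepts the file path as its only argument.
    return reader(path)
```

With a session in hand, this would be called as read_data_model(neat, "my_model.yaml"), where the file name is a placeholder.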
Writing data models
We will now write the same model we read from CDF back to CDF and inspect the results of the write operation. Specifically, we will write in dry-run mode, meaning that no data is actually written to CDF; instead, the deployment plan is presented to the user. Dry-run is the default mode for the CDF writer.
neat.physical_data_model.write.cdf()
Write Physical Data Model - cdf ✅ 📊 For details on result run .result
Calling neat.result presents a navigable web UI with all details of the dry run, indicating which components of the data model will be:
- created
- updated
- deleted
- unchanged
- skipped
If the write operation is done in non-dry-run mode, neat.result instead shows what was actually deployed to CDF and what failed, with the corresponding error messages.
neat.result
Deployment Result
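Since dry-run is the default, an actual deployment has to be requested explicitly. The sketch below wraps the writer so that a real write needs a deliberate opt-in. The keyword name dry_run is an assumption, as this tutorial only shows the call without arguments; check the writing documentation for the actual parameter.

```python
def write_to_cdf(neat, apply=False):
    """Run the CDF writer, as a dry run unless apply is True."""
    writer = neat.physical_data_model.write.cdf
    if apply:
        # Assumption: the writer accepts a `dry_run` keyword argument.
        return writer(dry_run=False)
    # No arguments: relies on dry-run being the writer's default mode.
    return writer()
```

A typical flow would be to call write_to_cdf(neat), review neat.result, and only then call write_to_cdf(neat, apply=True).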
As with reading, writing to other formats such as Excel, JSON, and YAML is done by using the appropriate writer from the neat.physical_data_model.write module. For more details, please refer to the physical data model writing documentation.