Prepare

cognite.neat._session._prepare.PrepareAPI #

Apply various operations to the knowledge graph as necessary preprocessing steps before, for instance, inferring a data model or exporting the knowledge graph to a desired destination.

cognite.neat._session._prepare.InstancePrepareAPI #

Operations to perform on instances of data in the knowledge graph.

dexpi() #

Prepares an extracted DEXPI graph for further use in CDF.

This method bundles several graph transformers which

  • attach values of generic attributes to nodes
  • create associations between nodes
  • remove unused generic attributes
  • remove associations between nodes that do not exist in the extracted graph
  • remove edges to nodes that do not exist in the extracted graph

and thereby safeguard CDF from a malformed graph.

Example

Apply DEXPI-specific transformations:

```python
neat.prepare.instances.dexpi()
```

aml() #

Prepares an extracted AutomationML (AML) graph for further use in CDF.

This method bundles several graph transformers which

  • attach values of attributes to nodes
  • remove unused attributes
  • remove edges to nodes that do not exist in the extracted graph

and thereby safeguard CDF from a malformed graph.

Example

Apply AML-specific transformations:

```python
neat.prepare.instances.aml()
```

make_connection_on_exact_match(source, target, connection=None, limit=100) #

Make connection on exact match.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `tuple[str, str]` | The source of the connection. A tuple of (type, property) where property is the property that should be matched on the source to make the connection with the target. | *required* |
| `target` | `tuple[str, str]` | The target of the connection. A tuple of (type, property) where property is the property that should be matched on the target to make the connection with the source. | *required* |
| `connection` | `str \| None` | New property to use for the connection. If None, the connection will be made by lowercasing the target type. | `None` |
| `limit` | `int \| None` | The maximum number of connections to make. If None, all connections are made. | `100` |

Make Connection on Exact Match

This method will make a connection between the source and target based on the exact match: (SourceType)-[sourceProperty]->(sourceValue) == (TargetType)-[targetProperty]->(targetValue)

The connection will be made by creating a new property on the source type that will contain the target value, as follows: (SourceType)-[connection]->(TargetType)

Example

Make connection on exact match:

```python
# From an active NeatSession
neat.read.csv("workitem.Table.csv",
              type="Activity",
              primary_key="sourceId")

neat.read.csv("assets.Table.csv",
              type="Asset",
              primary_key="WMT_TAG_GLOBALID")

# Here we specify what column from the source table we should use when we link
# it with a column in the target table. In this case, it is the
# "workorderItemname" column in the source table.
source = ("Activity", "workorderItemname")

# Here we give a name to the new property that is created when a match between
# the source and target is found.
connection = "asset"

# Here we specify what column from the target table we should use when searching
# for a match. In this case, it is the "wmtTagName" column in the target table.
target = ("Asset", "wmtTagName")

neat.prepare.instances.make_connection_on_exact_match(source, target, connection)
```
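Conceptually, the matching works like an exact-value join between the two property columns. A minimal pure-Python sketch of the mechanic, with instances simplified to dicts (this is an illustration, not the library implementation):

```python
def connect_on_exact_match(sources, targets, source_prop, target_prop, connection):
    """Add `connection` on each source instance whose source_prop value
    exactly equals some target instance's target_prop value."""
    # Index targets by the value of the property we match on.
    by_value = {t[target_prop]: t for t in targets if target_prop in t}
    for s in sources:
        match = by_value.get(s.get(source_prop))
        if match is not None:
            s[connection] = match["id"]
    return sources
```

With the example above, a source `Activity` whose `workorderItemname` equals a target `Asset`'s `wmtTagName` would gain an `asset` property pointing at that asset.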

relationships_as_edges(min_relationship_types=1, limit_per_type=None) #

This assumes that you have read a classic CDF knowledge graph including relationships.

This method converts relationships into edges in the graph. This is useful as the edges will be picked up as part of the schema connected to Assets, Events, Files, Sequences, and TimeSeries in the InferenceImporter.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `min_relationship_types` | `int` | The minimum number of relationships of a given type that must exist to convert those relationships to edges. For example, if there are only 5 relationships between Assets and TimeSeries, and the limit is 10, those relationships will not be converted to edges. | `1` |
| `limit_per_type` | `int \| None` | The number of conversions to perform per relationship type. For example, if there are 10 relationships between Assets and TimeSeries, and limit_per_type is 1, only 1 of those relationships will be converted to an edge. If None, all relationships will be converted. | `None` |
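The two parameters combine into a per-type selection rule. A minimal sketch, assuming relationships are represented as (source type, target type) pairs (not the library implementation):

```python
from collections import defaultdict

def select_edges(relationships, min_relationship_types=1, limit_per_type=None):
    """Keep a relationship type only if it occurs at least
    min_relationship_types times, then cap conversions per type."""
    by_type = defaultdict(list)
    for rel in relationships:
        by_type[rel].append(rel)
    edges = []
    for rel_type, rels in by_type.items():
        if len(rels) < min_relationship_types:
            continue  # too few relationships of this type to convert
        edges.extend(rels if limit_per_type is None else rels[:limit_per_type])
    return edges
```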

convert_data_type(source, *, convert=None) #

Convert the data type of the given property.

This is, for example, useful when you have a boolean property that you want to convert to an enum.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `tuple[str, str]` | The source of the conversion. A tuple of (type, property) where property is the property that should be converted. | *required* |
| `convert` | `Callable[[Any], Any] \| None` | The function to use for the conversion. It should take the value of the property as input and return the converted value. Defaults to assuming the value is a string that should be converted to int, float, bool, or datetime. | `None` |
Example

Convert a boolean property to a string:

```python
neat.prepare.instances.convert_data_type(
    ("TimeSeries", "isString"),
    convert=lambda is_string: "string" if is_string else "numeric"
)
```
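The documented default (string to int, float, bool, or datetime) can be pictured as a try-in-order cascade. A simplified illustration of such behavior, not the library's actual default function:

```python
from datetime import datetime

def default_convert(value: str):
    """Try int, then float, then bool, then ISO datetime;
    fall back to returning the raw string."""
    for caster in (int, float):
        try:
            return caster(value)
        except ValueError:
            pass
    if value.lower() in {"true", "false"}:
        return value.lower() == "true"
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return value
```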

property_to_type(source, type, new_property=None) #

Convert a property to a new type.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `tuple[str \| None, str]` | The source of the conversion. A tuple of (type, property) where property is the property that should be converted. You can pass (None, property) to convert all properties with the given name. | *required* |
| `type` | `str` | The new type of the property. | *required* |
| `new_property` | `str \| None` | Add the identifier as a new property. If None, the new entity will not have a property. | `None` |
Example

Convert the property 'source' to SourceSystem:

```python
neat.prepare.instances.property_to_type(
    (None, "source"), "SourceSystem"
)
```

connection_to_data_type(source) #

Converts a connection to a data type.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `source` | `tuple[str \| None, str]` | The source of the conversion. A tuple of (type, property) where property is the property that should be converted. You can pass (None, property) to convert all properties with the given name. | *required* |

Example

Convert all properties 'labels' from a connection to a string:

```python
neat.prepare.instances.connection_to_data_type(
    (None, "labels")
)
```

cognite.neat._session._prepare.DataModelPrepareAPI #

Operations to perform on a data model as part of a workflow before writing the data model to a desired destination.

cdf_compliant_external_ids() #

Convert data model component external ids into CDF-compliant entities.
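As a rough illustration of what compliance entails: CDF external ids are restricted to a limited character set and must not start with certain characters. A sketch of such a sanitizer, assuming a hypothetical ruleset (the exact rules neat applies are not shown here; consult the CDF data modeling documentation):

```python
import re

def to_compliant(external_id: str) -> str:
    """Sketch: replace characters outside [a-zA-Z0-9_] with underscores
    and ensure the id starts with a letter (assumed rules, for illustration)."""
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", external_id)
    if not cleaned or not cleaned[0].isalpha():
        cleaned = "prefix_" + cleaned
    return cleaned
```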

prefix(prefix) #

Prefix all views in the data model with the given prefix.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `prefix` | `str` | The prefix to add to the views in the data model. | *required* |
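The effect is a simple rename of every view external id. A one-line sketch of the mechanic (not the library implementation):

```python
def prefix_views(view_external_ids, prefix):
    # Prepend the prefix to every view external id in the data model.
    return [f"{prefix}{v}" for v in view_external_ids]
```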

to_enterprise(data_model_id, org_name='My', dummy_property='GUID', move_connections=False) #

Uses the current data model as a basis to create an enterprise data model.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `data_model_id` | `DataModelIdentifier` | The id of the enterprise data model that is being created. | *required* |
| `org_name` | `str` | Organization name to use for the views in the enterprise data model. | `'My'` |
| `dummy_property` | `str` | The dummy property to use as a placeholder for the views in the new data model. | `'GUID'` |
| `move_connections` | `bool` | If True, the connections will be moved to the new data model. | `False` |

Enterprise Data Model Creation

Always create an enterprise data model from a Cognite Data Model, as this ensures that the Cognite Data Fusion applications run smoothly, such as:

  • Search
  • Atlas AI
  • ...

Move Connections

If you want to move the connections to the new data model, set move_connections to True. This will move the connections to the new data model and use the new model views as the source and target views.
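The renaming and placeholder mechanic can be sketched as follows: each new view is prefixed with org_name, implements the original view, and carries one dummy property (shapes simplified to dicts; an illustration, not the library implementation):

```python
def enterprise_view(view_external_id, org_name="My", dummy_property="GUID"):
    """Sketch: the new view implements the original view and gets
    one placeholder property so it owns at least one property."""
    return {
        "externalId": f"{org_name}{view_external_id}",
        "implements": [view_external_id],
        "properties": {dummy_property: {"type": "text"}},
    }
```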

to_solution(data_model_id, org_name='My', mode='read', dummy_property='GUID') #

Uses the current data model as a basis to create a solution data model.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `data_model_id` | `DataModelIdentifier` | The id of the solution data model that is being created. | *required* |
| `org_name` | `str` | Organization name to use for the views in the new data model. | `'My'` |
| `mode` | `Literal['read', 'write']` | The mode of the solution data model. Can be either "read" or "write". | `'read'` |
| `dummy_property` | `str` | The dummy property to use as a placeholder for the views in the new data model. | `'GUID'` |

Solution Data Model Mode

A read-only solution model can only read from the existing containers of the enterprise data model; it has no containers in the solution data model space, so its views are read-only.

In write mode, additional containers are created in the solution data model space, so that in addition to reading through the solution model views, you can also write to the containers in the solution data model space.
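The difference between the two modes can be sketched as follows (simplified shapes; an illustration, not the library implementation):

```python
def solution_model(view_ids, mode="read", solution_space="sp_solution"):
    """Sketch: in read mode, views map only to existing enterprise
    containers; in write mode, each view also gets a container
    in the solution data model space."""
    if mode == "read":
        containers = []
    else:
        containers = [(solution_space, v) for v in view_ids]
    return {"views": list(view_ids), "containers": containers}
```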

to_data_product(data_model_id, org_name='', include='same-space') #

Uses the current data model as a basis to create a data product data model.

A data product model is a data model that ONLY maps to containers and does not use implements. This is typically used for defining the data in a data product.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `data_model_id` | `DataModelIdentifier` | The id of the data product data model that is being created. | *required* |
| `org_name` | `str` | Organization name used as a prefix if the model is building on top of a Cognite Data Model. | `''` |
| `include` | `Literal['same-space', 'all']` | The views to include in the data product data model. Can be either "same-space" or "all". If you set "same-space", only the properties of the views in the same space as the data model will be included. | `'same-space'` |
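The include parameter can be sketched as a filter over which views contribute, assuming views are modeled as a mapping of external id to the space they live in (an illustration, not the library implementation):

```python
def data_product_views(views, model_space, include="same-space"):
    """Sketch: views is a mapping of view external id -> space.
    "same-space" keeps only views in the data model's own space."""
    if include == "all":
        return list(views)
    return [v for v, space in views.items() if space == model_space]
```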

reduce(drop) #

This is a special method that allows you to drop parts of the data model. It only applies to Cognite Data Models.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `drop` | `Collection[Literal['3D', 'Annotation', 'BaseViews'] \| str]` | What to drop from the data model. The values 3D, Annotation, and BaseViews are special values that drop multiple views at once. You can also pass externalIds of views to drop individual views. | *required* |
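The mechanic can be sketched as expanding group keywords into view sets and filtering. The group contents below are hypothetical placeholders, not the actual view lists used by the library:

```python
# Hypothetical group contents, for illustration only.
GROUPS = {
    "3D": {"Cognite3DObject", "Cognite3DModel"},
    "Annotation": {"CogniteAnnotation"},
}

def reduce_views(view_ids, drop):
    """Sketch: a known keyword expands to a set of views;
    anything else is treated as a single view externalId."""
    to_drop = set()
    for item in drop:
        to_drop |= GROUPS.get(item, {item})
    return [v for v in view_ids if v not in to_drop]
```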

include_referenced() #

Include referenced views and containers in the data model.