Prepare

cognite.neat._session._prepare.PrepareAPI #

Apply various operations on the knowledge graph as necessary preprocessing steps before, for instance, inferring a data model or exporting the knowledge graph to a desired destination.

cognite.neat._session._prepare.InstancePrepareAPI #

Operations to perform on instances of data in the knowledge graph.

make_connection_on_exact_match(source, target, connection=None, limit=100) #

Make connection on exact match.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source | tuple[str, str] | The source of the connection. A tuple of (type, property) where property is the property that should be matched on the source to make the connection with the target. | required |
| target | tuple[str, str] | The target of the connection. A tuple of (type, property) where property is the property that should be matched on the target to make the connection with the source. | required |
| connection | str \| None | The new property to use for the connection. If None, the connection will be made by lowercasing the target type. | None |
| limit | int \| None | The maximum number of connections to make. If None, all connections will be made. | 100 |

Make Connection on Exact Match

This method will make a connection between the source and target based on an exact match:

`(SourceType)-[sourceProperty]->(sourceValue) == (TargetType)-[targetProperty]->(targetValue)`

The connection will be made by creating a new property on the source type that will contain the target value, as follows:

`(SourceType)-[connection]->(TargetType)`

Example

Make connection on exact match:

```python
# From an active NeatSession
neat.read.csv("workitem.Table.csv",
              type="Activity",
              primary_key="sourceId")

neat.read.csv("assets.Table.csv",
              type="Asset",
              primary_key="WMT_TAG_GLOBALID")

# Specify which column from the source table to use when linking it with a
# column in the target table. Here it is the "workorderItemname" column in
# the source table.
source = ("Activity", "workorderItemname")

# Give a name to the new property that is created when a match between the
# source and the target is found.
connection = "asset"

# Specify which column from the target table to use when searching for a
# match. Here it is the "wmtTagName" column in the target table.
target = ("Asset", "wmtTagName")

neat.prepare.instances.make_connection_on_exact_match(source, target, connection)
```

relationships_as_edges(min_relationship_types=1, limit_per_type=None) #

This assumes that you have read a classic CDF knowledge graph including relationships.

This method converts relationships into edges in the graph. This is useful as the edges will be picked up as part of the schema connected to Assets, Events, Files, Sequences, and TimeSeries in the InferenceImporter.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| min_relationship_types | int | The minimum number of relationship types that must exist to convert those relationships to edges. For example, if there are only 5 relationships between Assets and TimeSeries, and min_relationship_types is 10, those relationships will not be converted to edges. | 1 |
| limit_per_type | int \| None | The number of conversions to perform per relationship type. For example, if there are 10 relationships between Assets and TimeSeries, and limit_per_type is 1, only 1 of those relationships will be converted to an edge. If None, all relationships will be converted. | None |
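
Example

A minimal sketch, assuming a classic CDF knowledge graph including relationships has already been read into the active NeatSession (the threshold value is illustrative):

```python
# Only convert relationship types that occur at least 10 times, and
# convert every matching relationship (no per-type cap).
neat.prepare.instances.relationships_as_edges(
    min_relationship_types=10,
    limit_per_type=None,
)
```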

convert_data_type(source, *, convert=None) #

Convert the data type of the given property.

This is, for example, useful when you have a boolean property that you want to convert to an enum.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source | tuple[str, str] | The source of the conversion. A tuple of (type, property) where property is the property that should be converted. | required |
| convert | Callable[[Any], Any] \| None | The function to use for the conversion. The function should take the value of the property as input and return the converted value. Defaults to assuming the value is a string that should be converted to int, float, bool, or datetime. | None |
Example

Convert a boolean property to a string:

```python
neat.prepare.instances.convert_data_type(
    ("TimeSeries", "isString"),
    convert=lambda is_string: "string" if is_string else "numeric"
)
```

property_to_type(source, type, new_property=None) #

Convert a property to a new type.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source | tuple[str \| None, str] | The source of the conversion. A tuple of (type, property) where property is the property that should be converted. You can pass (None, property) to convert all properties with the given name. | required |
| type | str | The new type of the property. | required |
| new_property | str \| None | Add the identifier as a new property. If None, the new entity will not have a property. | None |
Example

Convert the property 'source' to the type SourceSystem:

```python
neat.prepare.instances.property_to_type(
    (None, "source"), "SourceSystem"
)
```

connection_to_data_type(source) #

Converts a connection to a data type.

Parameters:

Name Type Description Default
source tuple[str | None, str]

The source of the conversion. A tuple of (type, property) where property is the property that should be converted. You can pass (None, property) to covert all properties with the given name.

required

Example

Convert all properties 'labels' from a connection to a string:

```python
neat.prepare.instances.connection_to_data_type(
    (None, "labels")
)
```

cognite.neat._session._prepare.DataModelPrepareAPI #

Operations to perform on a data model as part of a workflow before writing the data model to a desired destination.

prefix(prefix) #

Prefix all views in the data model with the given prefix.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| prefix | str | The prefix to add to the views in the data model. | required |
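
Example

A minimal sketch, assuming the data model API is exposed on the session as neat.prepare.data_model, mirroring neat.prepare.instances (the prefix value is illustrative):

```python
# Prefix every view in the data model, e.g. Pump -> MyPrefixPump.
neat.prepare.data_model.prefix("MyPrefix")
```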

standardize_naming() #

Standardize the naming of all views/classes/properties in the data model.

For classes/views/containers, the naming will be standardized to PascalCase. For properties, the naming will be standardized to camelCase.
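
Example

A minimal sketch, under the same neat.prepare.data_model assumption:

```python
# Renames e.g. a view "work order" to "WorkOrder" (PascalCase) and a
# property "WORK_ORDER_NUMBER" to "workOrderNumber" (camelCase).
neat.prepare.data_model.standardize_naming()
```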

standardize_space_and_version() #

Standardize space and version in the data model.

This method will standardize the space and version in the data model to the Cognite standard.
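
Example

A minimal sketch, under the same neat.prepare.data_model assumption:

```python
# Rewrite the data model's space and version to follow the Cognite standard.
neat.prepare.data_model.standardize_space_and_version()
```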