Extractors
cognite.neat.graph.extractors
BaseExtractor
This is the base class for all extractors. It defines the interface that extractors must implement.
Source code in cognite/neat/graph/extractors/_base.py
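The interface is not spelled out on this page, so the following is only a minimal sketch of what "implementing the extractor interface" could look like. It assumes the interface boils down to a single extract() method that yields triples. SketchBaseExtractor, PumpExtractor, and the string-based Triple alias are hypothetical stand-ins, not part of cognite-neat.

```python
from abc import ABC, abstractmethod
from collections.abc import Iterable

# Hypothetical simplification: a triple is (subject, predicate, object) as plain strings.
Triple = tuple[str, str, str]


class SketchBaseExtractor(ABC):
    """Stand-in for the extractor interface (assumed shape, not the real class)."""

    @abstractmethod
    def extract(self) -> Iterable[Triple]:
        """Return the triples produced by this extractor."""


class PumpExtractor(SketchBaseExtractor):
    """Toy extractor that turns a list of pump names into type triples."""

    def __init__(self, names: list[str]) -> None:
        self.names = names

    def extract(self) -> Iterable[Triple]:
        for name in self.names:
            yield (f"ex:{name}", "rdf:type", "ex:Pump")


triples = list(PumpExtractor(["P-101", "P-102"]).extract())
```

Keeping extract() lazy (a generator) means a consumer can stop early without materializing every triple.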
AssetsExtractor
Bases: ClassicCDFExtractor[Asset]
Extract data from Cognite Data Fusion Assets into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[Asset] | An iterable of assets. | required |
namespace | Namespace | The namespace to use. If None, DEFAULT_NAMESPACE is used. | None |
to_type | Callable[[Asset], str \| None] | A function that maps an asset to a type. Defaults to None. If None, or if the function returns None, the asset is given the default type "Asset". | None |
total | int | The total number of assets to load. If passed, a progress bar is shown when rich is installed. Defaults to None. | None |
limit | int | The maximum number of assets to load. Defaults to None. This is typically used to test the extractor setup. For example, if you are extracting 100 000 assets, you might limit the extraction to 1000 assets while testing. | None |
unpack_metadata | bool | Whether to unpack metadata. Defaults to True. If False, the metadata is yielded as a JSON string. | True |
skip_metadata_values | set[str] \| frozenset[str] \| None | A set of values to skip when unpacking metadata. Defaults to frozenset({"nan", "null", "none", ""}). | DEFAULT_SKIP_METADATA_VALUES |
Source code in cognite/neat/graph/extractors/_classic_cdf/_assets.py
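The to_type fallback described in the parameter table can be sketched without the library. FakeAsset, type_from_metadata, resolve_type, and the "category" metadata key are hypothetical names used only for illustration; the real extractor's internals may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class FakeAsset:
    """Hypothetical stand-in for a CDF Asset; only the fields used here."""

    external_id: str
    metadata: dict[str, str] = field(default_factory=dict)


def type_from_metadata(asset: FakeAsset) -> Optional[str]:
    """Example to_type callable: read a (made-up) 'category' metadata key."""
    return asset.metadata.get("category")


def resolve_type(
    asset: FakeAsset,
    to_type: Optional[Callable[[FakeAsset], Optional[str]]] = None,
    default: str = "Asset",
) -> str:
    """Mimic the documented rule: no callable, or a None result, means the default type."""
    if to_type is None:
        return default
    result = to_type(asset)
    return default if result is None else result


pump = FakeAsset("pump-1", {"category": "Pump"})
unknown = FakeAsset("thing-1")

pump_type = resolve_type(pump, type_from_metadata)
unknown_type = resolve_type(unknown, type_from_metadata)
no_callable_type = resolve_type(pump)
```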
EventsExtractor
Bases: ClassicCDFExtractor[Event]
Extract data from Cognite Data Fusion Events into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[Event] | An iterable of items. | required |
namespace | Namespace | The namespace to use. If None, DEFAULT_NAMESPACE is used. | None |
to_type | Callable[[Event], str \| None] | A function that maps an item to a type. Defaults to None. If None, or if the function returns None, the item is given the default type. | None |
total | int | The total number of items to load. If passed, a progress bar is shown when rich is installed. Defaults to None. | None |
limit | int | The maximum number of items to load. Defaults to None. This is typically used to test the extractor setup. For example, if you are extracting 100 000 items, you might limit the extraction to 1000 items while testing. | None |
unpack_metadata | bool | Whether to unpack metadata. Defaults to True. If False, the metadata is yielded as a JSON string. | True |
skip_metadata_values | set[str] \| frozenset[str] \| None | If you are unpacking metadata, values in this set are skipped. | DEFAULT_SKIP_METADATA_VALUES |
Source code in cognite/neat/graph/extractors/_classic_cdf/_events.py
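To make the interplay of unpack_metadata and skip_metadata_values concrete, here is a rough, library-free sketch. It assumes the skip set is the one documented for AssetsExtractor and that values are compared case-insensitively; that comparison rule and the helper name metadata_to_pairs are assumptions, not confirmed behaviour.

```python
import json
from typing import Optional

# Assumed to match DEFAULT_SKIP_METADATA_VALUES as documented for AssetsExtractor.
SKIP_VALUES = frozenset({"nan", "null", "none", ""})


def metadata_to_pairs(
    metadata: dict[str, str],
    unpack: bool = True,
    skip_values: Optional[frozenset[str]] = SKIP_VALUES,
) -> list[tuple[str, str]]:
    """Return (key, value) pairs for unpacked metadata, or one JSON-string pair."""
    if not unpack:
        # Not unpacking: the whole dictionary is kept as a single JSON string.
        return [("metadata", json.dumps(metadata))]
    skip = skip_values or frozenset()
    # Assumption: comparison ignores case and surrounding whitespace.
    return [(k, v) for k, v in metadata.items() if v.strip().lower() not in skip]


unpacked = metadata_to_pairs({"site": "Oslo", "note": "NaN", "comment": ""})
packed = metadata_to_pairs({"site": "Oslo"}, unpack=False)
```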
FilesExtractor
Bases: ClassicCDFExtractor[FileMetadata]
Extract data from Cognite Data Fusion file metadata into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[FileMetadata] | An iterable of items. | required |
namespace | Namespace | The namespace to use. If None, DEFAULT_NAMESPACE is used. | None |
to_type | Callable[[FileMetadata], str \| None] | A function that maps an item to a type. Defaults to None. If None, or if the function returns None, the item is given the default type. | None |
total | int | The total number of items to load. If passed, a progress bar is shown when rich is installed. Defaults to None. | None |
limit | int | The maximum number of items to load. Defaults to None. This is typically used to test the extractor setup. For example, if you are extracting 100 000 items, you might limit the extraction to 1000 items while testing. | None |
unpack_metadata | bool | Whether to unpack metadata. Defaults to True. If False, the metadata is yielded as a JSON string. | True |
skip_metadata_values | set[str] \| frozenset[str] \| None | If you are unpacking metadata, values in this set are skipped. | DEFAULT_SKIP_METADATA_VALUES |
Source code in cognite/neat/graph/extractors/_classic_cdf/_files.py
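The effect of limit on a large iterable can be illustrated with itertools.islice. This is only a sketch of the idea of capping an iterable for a test run, not the extractor's actual implementation; limited and fake_files are hypothetical helpers.

```python
from collections.abc import Iterable, Iterator
from itertools import islice
from typing import Optional, TypeVar

T = TypeVar("T")


def limited(items: Iterable[T], limit: Optional[int] = None) -> Iterator[T]:
    """Yield at most `limit` items; with limit=None every item is yielded."""
    return islice(items, limit)


def fake_files() -> Iterator[str]:
    """Stand-in for a large, lazily produced collection of file metadata."""
    for i in range(100_000):
        yield f"file-{i}"


# Only the first 1000 items are pulled from the generator.
sample = list(limited(fake_files(), limit=1000))
```

Because the source is consumed lazily, the remaining 99 000 items are never produced, which is what makes a small limit useful when testing the setup.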
LabelsExtractor
Bases: ClassicCDFExtractor[LabelDefinition]
Extract data from Cognite Data Fusion Labels into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[LabelDefinition] | An iterable of items. | required |
namespace | Namespace | The namespace to use. If None, DEFAULT_NAMESPACE is used. | None |
to_type | Callable[[LabelDefinition], str \| None] | A function that maps an item to a type. Defaults to None. If None, or if the function returns None, the item is given the default type. | None |
total | int | The total number of items to load. If passed, a progress bar is shown when rich is installed. Defaults to None. | None |
limit | int | The maximum number of items to load. Defaults to None. This is typically used to test the extractor setup. For example, if you are extracting 100 000 items, you might limit the extraction to 1000 items while testing. | None |
unpack_metadata | bool | Whether to unpack metadata. Defaults to True. If False, the metadata is yielded as a JSON string. | True |
skip_metadata_values | set[str] \| frozenset[str] \| None | If you are unpacking metadata, values in this set are skipped. | DEFAULT_SKIP_METADATA_VALUES |
Source code in cognite/neat/graph/extractors/_classic_cdf/_labels.py
RelationshipsExtractor
Bases: ClassicCDFExtractor[Relationship]
Extract data from Cognite Data Fusion Relationships into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[Relationship] | An iterable of items. | required |
namespace | Namespace | The namespace to use. If None, DEFAULT_NAMESPACE is used. | None |
to_type | Callable[[Relationship], str \| None] | A function that maps an item to a type. Defaults to None. If None, or if the function returns None, the item is given the default type. | None |
total | int | The total number of items to load. If passed, a progress bar is shown when rich is installed. Defaults to None. | None |
limit | int | The maximum number of items to load. Defaults to None. This is typically used to test the extractor setup. For example, if you are extracting 100 000 items, you might limit the extraction to 1000 items while testing. | None |
unpack_metadata | bool | Whether to unpack metadata. Defaults to True. If False, the metadata is yielded as a JSON string. | True |
skip_metadata_values | set[str] \| frozenset[str] \| None | If you are unpacking metadata, values in this set are skipped. | DEFAULT_SKIP_METADATA_VALUES |
Source code in cognite/neat/graph/extractors/_classic_cdf/_relationships.py
SequencesExtractor
Bases: ClassicCDFExtractor[Sequence]
Extract data from Cognite Data Fusion Sequences into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[Sequence] | An iterable of items. | required |
namespace | Namespace | The namespace to use. If None, DEFAULT_NAMESPACE is used. | None |
to_type | Callable[[Sequence], str \| None] | A function that maps an item to a type. Defaults to None. If None, or if the function returns None, the item is given the default type. | None |
total | int | The total number of items to load. If passed, a progress bar is shown when rich is installed. Defaults to None. | None |
limit | int | The maximum number of items to load. Defaults to None. This is typically used to test the extractor setup. For example, if you are extracting 100 000 items, you might limit the extraction to 1000 items while testing. | None |
unpack_metadata | bool | Whether to unpack metadata. Defaults to True. If False, the metadata is yielded as a JSON string. | True |
skip_metadata_values | set[str] \| frozenset[str] \| None | If you are unpacking metadata, values in this set are skipped. | DEFAULT_SKIP_METADATA_VALUES |
Source code in cognite/neat/graph/extractors/_classic_cdf/_sequences.py
TimeSeriesExtractor
Bases: ClassicCDFExtractor[TimeSeries]
Extract data from Cognite Data Fusion TimeSeries into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[TimeSeries] | An iterable of items. | required |
namespace | Namespace | The namespace to use. If None, DEFAULT_NAMESPACE is used. | None |
to_type | Callable[[TimeSeries], str \| None] | A function that maps an item to a type. Defaults to None. If None, or if the function returns None, the item is given the default type. | None |
total | int | The total number of items to load. If passed, a progress bar is shown when rich is installed. Defaults to None. | None |
limit | int | The maximum number of items to load. Defaults to None. This is typically used to test the extractor setup. For example, if you are extracting 100 000 items, you might limit the extraction to 1000 items while testing. | None |
unpack_metadata | bool | Whether to unpack metadata. Defaults to True. If False, the metadata is yielded as a JSON string. | True |
skip_metadata_values | set[str] \| frozenset[str] \| None | If you are unpacking metadata, values in this set are skipped. | DEFAULT_SKIP_METADATA_VALUES |
Source code in cognite/neat/graph/extractors/_classic_cdf/_timeseries.py
DexpiExtractor
Bases: BaseExtractor
Extract RDF triples from a DEXPI XML document.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
root | Element | XML root element of DEXPI file. | required |
namespace | Namespace \| None | Optional custom namespace to use for extracted triples that define data model instances. Defaults to DEFAULT_NAMESPACE. | None |
Source code in cognite/neat/graph/extractors/_dexpi.py
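Since the constructor takes an already-parsed XML root element, the caller is responsible for parsing. The sketch below shows one way to obtain such a root with the standard library. The XML fragment is a made-up, heavily simplified stand-in for a DEXPI file, and the final constructor call is shown only as a comment because it requires cognite-neat to be installed.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily simplified fragment standing in for a real DEXPI file.
DEXPI_SNIPPET = """<PlantModel>
  <Equipment ID="E-1" ComponentClass="Pump"/>
  <Equipment ID="E-2" ComponentClass="Tank"/>
</PlantModel>"""

# For a file on disk, ET.parse(path).getroot() returns the same kind of object.
root = ET.fromstring(DEXPI_SNIPPET)

# Quick sanity check of what the root contains before handing it over.
equipment_ids = [element.attrib["ID"] for element in root.iter("Equipment")]

# With cognite-neat installed, the root would then be passed on as documented above:
#   extractor = DexpiExtractor(root)
```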
extract()
DMSExtractor
Bases: BaseExtractor
Extract data from Cognite Data Fusion DMS instances into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
items | Iterable[Instance] | The items to extract. | required |
total | int \| None | The total number of items to extract. If provided, this will be used to estimate the progress. | None |
limit | int \| None | The maximum number of items to extract. | None |
overwrite_namespace | Namespace \| None | If provided, this will overwrite the space of the extracted items. | None |
Source code in cognite/neat/graph/extractors/_dms.py
from_data_model(client, data_model, limit=None)
classmethod
Create an extractor from a data model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
client | CogniteClient | The Cognite client to use. | required |
data_model | DataModelIdentifier | The data model to extract. | required |
limit | int \| None | The maximum number of instances to extract. | None |
Source code in cognite/neat/graph/extractors/_dms.py
from_views(client, views, limit=None)
classmethod
Create an extractor from a set of views.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
client | CogniteClient | The Cognite client to use. | required |
views | Iterable[View] | The views to extract. | required |
limit | int \| None | The maximum number of instances to extract. | None |
Source code in cognite/neat/graph/extractors/_dms.py
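A possible way to use the from_data_model classmethod is sketched below. The identifier values are placeholders, and the sketch assumes that DataModelIdentifier accepts a (space, external_id, version) tuple, as it does in the Cognite Python SDK. build_dms_extractor is a hypothetical helper; calling it requires cognite-neat and a configured CogniteClient.

```python
def build_dms_extractor(client, limit=100):
    """Create a DMSExtractor for a placeholder data model (hypothetical helper).

    `client` is expected to be a configured CogniteClient.
    """
    # Imported lazily so that this sketch can be loaded without cognite-neat installed.
    from cognite.neat.graph.extractors import DMSExtractor

    # Placeholder identifier, assumed to be (space, external_id, version).
    data_model = ("my_space", "MyDataModel", "v1")

    # A small limit keeps a first test run cheap.
    return DMSExtractor.from_data_model(client, data_model, limit=limit)
```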
MockGraphGenerator
Bases: BaseExtractor
Generates mock graph data for testing NEAT.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rules | InformationRules \| DMSRules | Transformation rules defining the classes and their properties. | required |
class_count | dict[str \| ClassEntity, int] \| None | Target number of instances for each class in the ontology. | None |
stop_on_exception | bool | Whether to stop when an exception is encountered. Defaults to False. | False |
allow_isolated_classes | bool | Whether to generate instances for classes that are not connected to any other class. Defaults to True. | True |
Source code in cognite/neat/graph/extractors/_mock_graph_generator.py
extract()
Generate mock triples based on the transformation rules that define the data model and the desired number of instances per class.
Returns:
Type | Description |
---|---|
list[Triple] | List of RDF triples, represented as tuples. |
Source code in cognite/neat/graph/extractors/_mock_graph_generator.py
RdfFileExtractor
Bases: BaseExtractor
Extract data from RDF files into Neat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
filepath | Path | The path to the RDF file. | required |
mime_type | MIMETypes | The MIME type of the RDF file. Defaults to "application/rdf+xml". | 'application/rdf+xml' |
base_uri | URIRef | The base URI to use. Defaults to None. | None |