SEISNC conventions for xarray
The SEGY-SAK Xarray seismic specification termed SEISNC can be used by SEGY-SAK to
create and manage Xarray Datasets with details and attributes common to seismic data.
Xarray uses NetCDF4 as its standard representation and this fits neatly with regularly
sampled seismic data (e.g. 3D data).
Moreover, unlike SEG-Y, xarray
compatible formats fit neatly into the Python scientific
stack providing more performant operations like lazy loading, easy slicing,
compatibility with multi-core and multi-node operations using dask
as well
as important features such as labelled axes and coordinates.
This specification is not meant to be prescriptive but outlines some basic
requirements for xarray
datasets to work effectively with SEGY-SAK functionality.
3D data dimensions
SEGY-SAK uses the convention labels of iline
, xline
to describe
the bins of 3D data. The vertical dimension is labelled samples
and it is up to the
user to understand the domain (usually TWT or depth).
A typical xarray
dataset created by SEGY-SAK will return for example
>>> seisnc_3d = xr.open_dataset('test3d.sgy', dim_byte_fields={"iline":189, "xline":193})
>>> seisnc_3d.sizes
Frozen({'iline': 61, 'xline': 202, 'samples': 850})
Frozen standard names are supplied via the HorDimKeyField
.
2D data dimensions
For 2D data SEGY-SAK uses the dimension label cdp
for CDP locations. This allows
the package to distinguish between 2D and 3D data to allow automation on saving
and convenience wrappers. The same vertical dimensions apply as for 3D.
A typical xarray
in 2D format would return
>>> seisnc_2d = xr.open_dataset('test3d.sgy', dim_byte_fields={"cdp": 21})
>>> seisnc_2d.sizes
Frozen({'cdp': 61, 'twt': 850})
Frozen standard names are supplied via the HorDimKeyField
.
Gathers / Pre-stack data / Extra dimensions
SEGY-SAK supports an arbitrary number of extra dimensions which might be written to SEG-Y. The most common types of data with extra dimensions are gathers also known pre-stack data. Often this data is either exported as angle or offset dependent gathers. SEGY-SAK provides standard names for these dimensions.
Frozen standard names are supplied via the DimKeyField
.
Map Coordinates
Map coordinates describe the physical location on earth of the seismic data. Rarely do these coordinates align nicely with a regular grid due to rotation of the seismic bin grid relative to the map grid. Therefore it is typically not feasible to use map coordinates for the Dataset dimensions.
Instead, X and Y map coordinates must be loading into the Dataset variables on regular dimensions such as
iline
/xline
or cdp
. SEGY-SAK by convention labels the X and Y map coordinate variables cdp_x
and cdp_y
.
Other variable names can be used if users use the ds.segysak.set_coords()
method.
>>> seisnc_3d = xr.open_dataset(
'test3d.sgy', dim_byte_fields={"iline":189, "xline":193}, extra_byte_fields={"cdp_x": 73, "cdp_y": 77}
)
>>> seisnc_3d.sizes
Frozen({'iline': 61, 'xline': 202, 'samples': 850})
>>> seisnc_3d.data_vars
Data variables:
cdp_x (iline, xline) float64 99kB 4.364e+05 4.364e+05 ... 4.341e+05
cdp_y (iline, xline) float64 99kB 6.477e+06 6.477e+06 ... 6.479e+06
data (iline, xline, samples) float32 42MB ...
Frozen standard names are supplied via the CoordKeyField
.
Attributes
Xarray Datasets have an attributes dictionary available from ds.attrs
.
This works well for simple attributes but SEGY-SAK found a need for more complex variable types not supported by standard
NetCDF4 as well as a need for a protected attribute area. To meet this need, SEGY-SAK implements it's own attrs
as a
JSON encoded string under ds.attrs['seisnc']
. Access to the variables in this string are via the segysak
Dataset
accessor (an extension for Xarray) ds.segysak.attrs
.
Any number of attributes can be added to a siesnc
file. Currently the
following attributes are extracted or reserved for use by SEGY-SAK. Attributes marked by are depreciated.
Attribute Name | type | Purpose |
---|---|---|
ns |
int | number of samples per trace |
ds |
float | sample interval |
text |
str | ebcidc header as ascii text |
measurement_sys |
vertical units of the data | |
d3_domain |
vertical domain of the data | |
epsg |
str | data epsg code |
corner_points |
tuple | corner points of the dataset in grid coordinates |
corner_points_xy |
tuple | corner points of the dataset in xy |
source_file |
str | name of the file the dataset was created from |
srd |
float | seismic reference datum of the data in vertical units measurement_sys and d3_domain |
datatype |
str | the data type e.g. amplitude, velocity, attribute |
percentiles |
tuple | this is an array of approximate percentile values created during scanning from SEG-Y. Primarily this is useful for plotting by limiting the dynamic range of the display. The percentiles are in percent 0, 0.1, 10, 50, 90, 99.9 & |
coord_scalar |
int | the SEG-Y coordinate scalar |
coord_scaled |
A boolean that indicates if the coordinates have been scaled or not. | |
dimensions |
A mapping of DimKeyField and HorKeyField to Dataset dimensions. |
|
vert_domain |
A member of VerticalKeyDim . |
Frozen standard names are supplied via the AttrKeyField
.
KeyFields
Keyfields represent standard names for variables, dimensions and attributes that SEGY-SAK uses internally
to access relevant data. It is helpful to stick to these conventions when working with SEGY-SAK. Alternatively
it is possible to set a mapping to your chosen names using the set_dimensions
, and set_coords
methods in
the accessor module segysak
.
The frozen dataclass
are prefixed by _
while instantiated dataclasses for general use have no prefix.
Frozen base classes
Info
These classes are referenced here to show the standard seisnc variable names. Use the instantiated classes if including these keyfields in your code.
Enumerations of dataset keys.
_HorDimKeyField
dataclass
Bases: MetaDataClass
Horizontal Dimensions of seismic data.
Source code in segysak/_keyfield.py
Python | |
---|---|
9 10 11 12 13 14 15 |
|
_VerDimKeyField
dataclass
Bases: MetaDataClass
Vertical Dimensions of seismic data.
Source code in segysak/_keyfield.py
Python | |
---|---|
21 22 23 24 25 26 27 |
|
_DimKeyField
dataclass
Bases: _HorDimKeyField
, _VerDimKeyField
Dimension keys to use for naming.
Source code in segysak/_keyfield.py
Python | |
---|---|
33 34 35 36 37 38 |
|
_CoordKeyField
dataclass
Bases: MetaDataClass
Coordinate Keys to use for naming key components of the Dataset
Source code in segysak/_keyfield.py
Python | |
---|---|
44 45 46 47 48 49 50 51 |
|
_VariableKeyField
dataclass
Bases: MetaDataClass
Source code in segysak/_keyfield.py
Python | |
---|---|
148 149 150 |
|
_AttrKeyField
dataclass
Bases: MetaDataClass
Source code in segysak/_keyfield.py
Python | |
---|---|
156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 |
|
_VerticalKeyDim
dataclass
Bases: MetaDataClass
Source code in segysak/_keyfield.py
Python | |
---|---|
189 190 191 192 193 |
|
Instantiated classes
from segysak import (
HorDimKeyField,
VerDimKeyField,
DimKeyField,
CoordKeyField,
VariableKeyField,
AttrKeyField,
VerticalKeyDim,
)