Skip to content

Xarray Accessor modules

Accessor modules are accessed as namespaces within the Xarray.Dataset objects created by SEGY-SAK. When segysak is imported, all xarray.Dataset objects will contain the .segysak and .seisio namespaces.

SeisIO

to_segy(segy_file, use_text=True, coord_scalar=None, data_var='data', vert_dimension='samples', write_dead_traces=False, dead_trace_var=None, trace_header_map=None, **dim_kwargs)

Output Xarray Dataset to SEG-Y format.

Parameters:

Name Type Description Default
segy_file Union[str, PathLike]

The output file to write to.

required
use_text bool

Write the text attribute from the dataset to the text header.

True
coord_scalar Union[float, None]

A SEG-Y compatible coordinate scalar.

None
data_var str

The variable name of the trace data in the Dataset.

'data'
vert_dimension str

The vertical (samples) dimension of the data_var.

'samples'
write_dead_traces bool

Write dead traces as zeros to the SEG-Y file.

False
dead_trace_var Union[str, None]

A dataset variable containing boolean values that identifies dead traces on the non-vertical dimension.

None
trace_header_map Dict[str, int]

Defaults to None. A dictionary of Dataset variables and byte locations. The variable will be written to the trace headers in the assigned byte location. By default CMP=23, cdp_x=181, cdp_y=185, iline=189, xline=193.

None
dim_kwargs int

The dimension/byte location pairs to output dimensions to. The number of dim_kwargs should be equal to the number of dimensions on the output data_array. The trace sort order will be as per the order passed to the function.

{}

Example

Python
ds3d.seisio.to_segy(
    "output_file.segy",
    use_text = True,
    vert_dimension = "samples",
    trace_header_map = {'cdp_x':181, 'cdp_y':185}
    iline = 189,
    xline = 193,
)

SegysakDataArrayAccessor

Bases: TemplateAccessor

attrs: Dict[str, Any] property

Return the seisnc attributes for this Xarray object

Note

The ds.segysak.attrs object is stored as a JSON string in ds.attrs['seisnc']. Doing this allows for normal methods to export the seisnc values. Accessing the ds.segysak.attrs automatically deserialises the string into a Python dict.

Example

Python
>>> ds.segysak.attrs
{'coord_scalar':-100}

humanbytes: str property

Prints Human Friendly size of Dataset to return the bytes as an int use xarray.Dataset.nbytes

Returns:

Name Type Description
str str

Human readable size of dataset.

__getitem__(key)

Return an attribute key from ds.segysak.attrs

__setitem__(key, value)

Set an attribute key to value in ds.segysak.attrs

Note

The value must be a JSON compatible object such as an int, string or list.

get(key, default=None)

Returns the value of key else default from ds.segysak.attrs similar to dict.get

get_dimensions()

Returns ds.segysak.attrs['dimensions'] if available. Else use ds.segysak.infer_dimensions().

infer_dimensions(ignore=None)

Infer the dimensions from those available in the dataset/array. They should match good seisnc dimension names from CoordKeyField

Parameters:

Name Type Description Default
ignore List[str]

A list of dimensions to ignore when considering mappings.

None

set_dimensions(seisnc_dims=None, seisnc_vert_dim=None)

Set the dimensions of the DataArray/Dataset. This is required to ensure that other functions in this module can interpret the DataArray correctly.

Parameters:

Name Type Description Default
seisnc_dims Dict[str, str]

Pairs from segysak.CoordKeyField and dimension vars.

None
seisnc_vert_dim Union[str, None]

The variable name for the vertical dimension.

None

set_vertical_domain(vert_domain)

Set the vertical domain of the DataArray, usually twt or depth.

Parameters:

Name Type Description Default
vert_domain _VerticalKeyField

The vertical domain key to use (i.e. twt or depth)

required

store_attributes(**kwargs)

Store attributes in for seisnc. NetCDF attributes are a little limited, so we expand the capabilities by serializing and deserializing from JSON text.

Parameters:

Name Type Description Default
kwargs Dict[str, JSON]

name and JSON compatible values to store in ds.segysak.attrs

{}

SegysakDatasetAccessor

Bases: TemplateAccessor

attrs: Dict[str, Any] property

Return the seisnc attributes for this Xarray object

Note

The ds.segysak.attrs object is stored as a JSON string in ds.attrs['seisnc']. Doing this allows for normal methods to export the seisnc values. Accessing the ds.segysak.attrs automatically deserialises the string into a Python dict.

Example

Python
>>> ds.segysak.attrs
{'coord_scalar':-100}

humanbytes: str property

Prints Human Friendly size of Dataset to return the bytes as an int use xarray.Dataset.nbytes

Returns:

Name Type Description
str str

Human readable size of dataset.

__getitem__(key)

Return an attribute key from ds.segysak.attrs

__setitem__(key, value)

Set an attribute key to value in ds.segysak.attrs

Note

The value must be a JSON compatible object such as an int, string or list.

calc_corner_points()

Calculate the corner points of the 3d geometry or end points of a 2D line.

Returns:

Name Type Description
corner_points List[Tuple[str, str]]

A list of cdp_x, cdp_y pairs for the corner points of the dataset.

coordinate_df(linear_fillna=True, three_d_only=False)

Return the coordinates of a Dataset as a DataFrame. Optionally do-not fill the missing coordinates.

fill_cdpna(method='linear')

Fills NaN cdp_x and cdp_y locations, usually caused by dead traces.

Parameters:

Name Type Description Default
method str

One of 'linear', 'affine'. Linear uses a planar linear transform, whilst affine uses the derived affine transformation.

'linear'

get(key, default=None)

Returns the value of key else default from ds.segysak.attrs similar to dict.get

get_affine_transform(force_recalc=False)

Returns a matplotlib Affine forward transform dims -> (cdp_x, cdp_y)

The matrix is only calculated once and then stored in the Dataset attributes as the matrix coefficients. To recalculate the matrix set force_recalc=True.

The reverse transform (cdp_x, cdp_y) is provided by get_affine_transform().inverted().

Parameters:

Name Type Description Default
force_recalc bool

Force recalculation of affine transform matrix.

False

Returns:

Type Description
Affine2D

The forward affine transform.

get_coords()

Returns attrs['cdp_x'], attrs['cdp_y'] if exist else user infer_coords()

get_dimensions()

Returns ds.segysak.attrs['dimensions'] if available. Else use ds.segysak.infer_dimensions().

infer_coords(ignore=None)

infer_dimensions(ignore=None)

Infer the dimensions from those available in the dataset/array. They should match good seisnc dimension names from CoordKeyField

Parameters:

Name Type Description Default
ignore List[str]

A list of dimensions to ignore when considering mappings.

None

interp_line(points, bin_spacing_hint=10.0, line_method='slinear', xysel_method='linear')

Select data at regular intervals along a set of path segments in X, Y defined by points.

Parameters:

Name Type Description Default
points array
required
bin_spacing_hint float

A bin spacing to stay close to, in cdp world units. Default: 10

10.0
line_method

Valid values for the kind argument in scipy.interpolate.interp1d

'slinear'
xysel_method

Valid values for DataArray.interp for the data interpolation to the path.

'linear'

Returns:

Type Description
Dataset

Interpolated dataset on points: Interpolated traces along the arbitrary line

plot_bounds(ax=None)

Plot survey bounding box to a new or existing matplotlib.Axis.

Parameters:

Name Type Description Default
ax Axis

Axis to plot onto.

None

Returns:

Type Description
Axis

matplotlib Axis

scale_coords(coord_scalar=None)

Scale the dataset coordinates using a SEG-Y coord_scalar.

The coordinate multiplier is given by:

coord_scalar_mult = np.power(abs(coord_scalar), np.sign(coord_scalar))

Or

scalar multiplier
1000 1000
100 100
10 10
1 1
0 1
-1 1
-10 0.1
-100 0.01
-1000 0.001

set_coords(seisnc_coords)

Set the seisnc coordinate mapping. This maps coordinate variable names to known seisnc coordinates.

Parameters:

Name Type Description Default
seisnc_coords Dict[Union[_CoordKeyField, str], str]

A mapping of coordinates to known seisnc standard Keyfields.

required

set_dimensions(seisnc_dims=None, seisnc_vert_dim=None)

Set the dimensions of the DataArray/Dataset. This is required to ensure that other functions in this module can interpret the DataArray correctly.

Parameters:

Name Type Description Default
seisnc_dims Dict[str, str]

Pairs from segysak.CoordKeyField and dimension vars.

None
seisnc_vert_dim Union[str, None]

The variable name for the vertical dimension.

None

store_attributes(**kwargs)

Store attributes in for seisnc. NetCDF attributes are a little limited, so we expand the capabilities by serializing and deserializing from JSON text.

Parameters:

Name Type Description Default
kwargs Dict[str, JSON]

name and JSON compatible values to store in ds.segysak.attrs

{}

xysel(points, method='nearest', sample_dim_name='cdp')

Perform selection on the dataset based upon cdp_x and cdp_y coordinates.

Parameters:

Name Type Description Default
points array

[M, 2] array of cdp_x and cdp_y points to select.

required
method str

Sampling interpolation method, as per xr.Dataset.interp().

'nearest'
sample_dim_name str

The output dimension name.

'cdp'

Returns:

Creating blank datasets

These methods help to create datasets with appropriate coordinates for seismic data.

create3d_dataset(dims, first_sample=0, sample_rate=1, first_iline=1, iline_step=1, first_xline=1, xline_step=1, first_offset=None, offset_step=None, vert_domain='TWT', vert_units=None)

Create a regular 3D seismic dataset from basic grid geometry with optional offset dimension for pre-stack data.

Parameters:

Name Type Description Default
dims tuple of int

The dimensions of the dataset to create (iline, xline, vertical). If first_offset is specified then (iline, xline, vertical, offset)

required
first_sample int

The first vertical sample. Defaults to 0.

0
sample_rate int

The vertical sample rate. Defaults to 1.

1
first_iline int

First inline number. Defaults to 1.

1
iline_step int

Inline increment. Defaults to 1.

1
first_xline int

First crossline number. Defaults to 1.

1
xline_step int

Crossline increment. Defaults to 1.

1
first_offset int / float

If not none, the offset dimension will be added starting at first offset. Defaults to None.

None
offset_step (int, float)

Required if first_offset is specified. The offset increment.

None
vert_domain str

Vertical domain, one of ('DEPTH', 'TWT'). Defaults to 'TWT'.

'TWT'
vert_units str

Measurement system of of vertical coordinates. One of ('ms', 's', 'm', 'km', 'ft'): Defaults to None for unknown.

None
Source code in segysak/_seismic_dataset.py
Python
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
def create3d_dataset(
    dims,
    first_sample=0,
    sample_rate=1,
    first_iline=1,
    iline_step=1,
    first_xline=1,
    xline_step=1,
    first_offset=None,
    offset_step=None,
    vert_domain="TWT",
    vert_units=None,
):
    """Create a regular 3D seismic dataset from basic grid geometry with optional
    offset dimension for pre-stack data.

    Args:
        dims (tuple of int): The dimensions of the dataset to create (iline, xline, vertical).
            If first_offset is specified then (iline, xline, vertical, offset)
        first_sample (int, optional): The first vertical sample. Defaults to 0.
        sample_rate (int, optional): The vertical sample rate. Defaults to 1.
        first_iline (int, optional): First inline number. Defaults to 1.
        iline_step (int, optional): Inline increment. Defaults to 1.
        first_xline (int, optional): First crossline number. Defaults to 1.
        xline_step (int, optional): Crossline increment. Defaults to 1.
        first_offset (int/float, optional): If not none, the offset dimension will be added starting
            at first offset. Defaults to None.
        offset_step (int, float, optional): Required if first_offset is specified. The offset increment.
        vert_domain (str, optional): Vertical domain, one of ('DEPTH', 'TWT'). Defaults to 'TWT'.
        vert_units (str, optional): Measurement system of of vertical coordinates.
            One of ('ms', 's', 'm', 'km', 'ft'): Defaults to None for unknown.
    """
    vert_domain = vert_domain.upper()

    if first_offset is None:
        ni, nx, ns = dims
    else:
        ni, nx, ns, no = dims

    units = _check_vert_units(vert_units)

    # 1e-10 to stabilise range on rounding errors for weird floats from hypothesis

    vert = np.arange(
        first_sample,
        first_sample + sample_rate * ns - 1e-10,
        sample_rate,
        dtype=int,
    )
    ilines = np.arange(
        first_iline,
        first_iline + iline_step * ni - 1e-10,
        iline_step,
        dtype=int,
    )
    xlines = np.arange(
        first_xline,
        first_xline + xline_step * nx - 1e-10,
        xline_step,
        dtype=int,
    )

    if first_offset is None:
        offset = None
    else:
        offset = np.arange(
            first_offset,
            first_offset + offset_step * no - 1e-10,
            offset_step,
            dtype=float,
        )

    builder, domain = _dataset_coordinate_helper(
        vert, vert_domain, iline=ilines, xline=xlines, offset=offset
    )

    ds = create_seismic_dataset(**builder)
    ds.attrs[AttrKeyField.measurement_system] = units
    ds.attrs[AttrKeyField.d3_domain] = domain
    ds.attrs[AttrKeyField.sample_rate] = sample_rate
    ds.attrs[AttrKeyField.text] = "SEGY-SAK Create 3D Dataset"
    ds.attrs[AttrKeyField.corner_points] = [
        (ilines[0], xlines[0]),
        (ilines[-1], xlines[0]),
        (ilines[-1], xlines[-1]),
        (ilines[0], xlines[-1]),
        (ilines[0], xlines[0]),
    ]

    return ds

create2d_dataset(dims, first_sample=0, sample_rate=1, first_cdp=1, cdp_step=1, first_offset=None, offset_step=None, vert_domain='TWT', vert_units=None)

Create a regular 2D seismic dataset from basic geometry.

Parameters:

Name Type Description Default
dims tuple of int

The dimensions of the dataset to create (ncdp, vertical). If first_offset is specified then (ncdp, vertical, offset)

required
first_sample int

The first vertical sample. Defaults to 0.

0
sample_rate int

The vertical sample rate. Defaults to 1.

1
first_cdp int

First CDP number. Defaults to 1.

1
cdp_step int

CDP increment. Defaults to 1.

1
first_offset int / float

If not none, the offset dimension will be added starting at first offset. Defaults to None.

None
offset_step (int, float)

Required if first_offset is specified. The offset increment.

None
vert_domain str

Vertical domain, one of ('DEPTH', 'TWT'). Defaults to 'TWT'.

'TWT'
vert_units str

Measurement system of of vertical coordinates. One of ('ms', 's', 'm', 'km', 'ft'): Defaults to None for unknown.

None
Source code in segysak/_seismic_dataset.py
Python
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
def create2d_dataset(
    dims,
    first_sample=0,
    sample_rate=1,
    first_cdp=1,
    cdp_step=1,
    first_offset=None,
    offset_step=None,
    vert_domain="TWT",
    vert_units=None,
):
    """Create a regular 2D seismic dataset from basic geometry.

    Args:
        dims (tuple of int): The dimensions of the dataset to create (ncdp, vertical).
            If first_offset is specified then (ncdp, vertical, offset)
        first_sample (int, optional): The first vertical sample. Defaults to 0.
        sample_rate (int, optional): The vertical sample rate. Defaults to 1.
        first_cdp (int, optional): First CDP number. Defaults to 1.
        cdp_step (int, optional): CDP increment. Defaults to 1.
        first_offset (int/float, optional): If not none, the offset dimension will be added starting
            at first offset. Defaults to None.
        offset_step (int, float, optional): Required if first_offset is specified. The offset increment.
        vert_domain (str, optional): Vertical domain, one of ('DEPTH', 'TWT'). Defaults to 'TWT'.
        vert_units (str, optional): Measurement system of of vertical coordinates.
            One of ('ms', 's', 'm', 'km', 'ft'): Defaults to None for unknown.
    """
    vert_domain = vert_domain.upper()

    if first_offset is None:
        ncdp, ns = dims
    else:
        ncdp, ns, no = dims

    # check units
    units = _check_vert_units(vert_units)

    vert = np.arange(
        first_sample, first_sample + sample_rate * ns, sample_rate, dtype=int
    )
    cdps = np.arange(first_cdp, first_cdp + cdp_step * ncdp, cdp_step, dtype=int)

    if first_offset is None:
        offset = None
    else:
        offset = np.arange(
            first_offset, first_offset + offset_step * no, offset_step, dtype=float
        )

    builder, domain = _dataset_coordinate_helper(
        vert, vert_domain, cdp=cdps, offset=offset
    )

    ds = create_seismic_dataset(**builder)
    ds.attrs[AttrKeyField.measurement_system] = units
    ds.attrs[AttrKeyField.d3_domain] = domain
    ds.attrs[AttrKeyField.sample_rate] = sample_rate
    ds.attrs[AttrKeyField.text] = "SEGY-SAK Create 2D Dataset"

    return ds

create_seismic_dataset(twt=None, depth=None, cdp=None, iline=None, xline=None, offset=None, segysak_attr=True, **dim_args)

Create a blank seismic dataset by setting the dimension sizes (d#) or by passing arrays for known dimensions.

iline and xline must be specified together and are mutually exclusive to cdp argument.

Parameters:

Name Type Description Default
twt int / array - like

Two-way time vertical sampling coordinates. Cannot be used with depth argument. Defaults to None.

None
depth int / array - like

Depth vertical sampling coordinates. Cannot be used with twt argument. Defaults to None.

None
cdp int / array - like

The CDP numbering for 2D data, cannot be used with iline or xline. Use for 2D seismic data. Defaults to None.

None
iline int / array - like

The iline numbering, cannot be used with cdp argument. Use for 3D seismic data. Defaults to None.

None
xline int / array - like

The xline numbering, cannot be used with cdp argument. Use for 3D seismic data. Defaults to None.

None
offset int / array - like

The offset. This will fill dimension d4. Use for pre-stack data. Defaults to None.

None
segysak_attr bool

Add SEGYSAK attributes to the Dataset

True
dim_args int / array - like

Other dimensions you would like in your dataset. The key will be the dimension name.

{}

Returns:

Type Description

xarray.Dataset: A dataset with the defined dimensions of input setup to work with seisnc standards.

Source code in segysak/_seismic_dataset.py
Python
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
def create_seismic_dataset(
    twt=None,
    depth=None,
    cdp=None,
    iline=None,
    xline=None,
    offset=None,
    segysak_attr=True,
    **dim_args,
):
    """Create a blank seismic dataset by setting the dimension sizes (d#) or by passing
    arrays for known dimensions.

    iline and xline must be specified together and are mutually exclusive to cdp argument.

    Args:
        twt (int/array-like, optional): Two-way time vertical sampling coordinates.
            Cannot be used with depth argument. Defaults to None.
        depth (int/array-like, optional): Depth vertical sampling coordinates.
            Cannot be used with twt argument. Defaults to None.
        cdp (int/array-like, optional): The CDP numbering for 2D data, cannot be used
            with iline or xline. Use for 2D seismic data. Defaults to None.
        iline (int/array-like, optional): The iline numbering, cannot be
            used with cdp argument. Use for 3D seismic data. Defaults to None.
        xline (int/array-like, optional): The xline numbering, cannot be
            used with cdp argument. Use for 3D seismic data. Defaults to None.
        offset (int/array-like, optional): The offset. This will fill dimension d4.
            Use for pre-stack data. Defaults to None.
        segysak_attr (bool, optional): Add SEGYSAK attributes to the Dataset
        dim_args (int/array-like): Other dimensions you would like in your dataset. The key will be the dimension name.

    Returns:
        xarray.Dataset: A dataset with the defined dimensions of input setup to work with seisnc standards.
    """
    # cdp not allowed with iline or xline
    if cdp is not None and (iline is not None or xline is not None):
        raise ValueError(
            "cdp argument cannot be used with 3D dimensions (iline or xline)"
        )

    if iline is not None and xline is None:
        raise ValueError("xline needed with iline argument, 3d geometry requires both")

    if xline is not None and iline is None:
        raise ValueError("xline needed with iline argument, 3d geometry requires both")

    # check inputs
    twt = _check_input(twt)
    depth = _check_input(depth)
    cdp = _check_input(cdp)
    iline = _check_input(iline)
    xline = _check_input(xline)
    offset = _check_input(offset)
    # # make sure other dim args are sensible and to spec
    for key in dim_args.keys():
        dim_args[key] = _check_input(dim_args[key])

    dimensions = dict()
    # create dimension d1
    if cdp is not None:
        dimensions[DimKeyField.cdp] = ([DimKeyField.cdp], cdp)
    elif iline is not None:  # 3d data
        dimensions[DimKeyField.iline] = ([DimKeyField.iline], iline)
        dimensions[DimKeyField.xline] = ([DimKeyField.xline], xline)

    # create dimension d3
    if twt is not None:
        dimensions[DimKeyField.twt] = ([DimKeyField.twt], twt)
    if depth is not None:
        dimensions[DimKeyField.depth] = ([DimKeyField.depth], depth)

    # create dimension d4
    if offset is not None:
        dimensions[DimKeyField.offset] = (
            [DimKeyField.offset],
            offset,
        )

    # any left over dims
    for arg, val in dim_args.items():
        dimensions[arg] = ([arg], val)

    ds = xr.Dataset(coords=dimensions)

    if segysak_attr:
        if twt is not None:
            ds.attrs[AttrKeyField.ns] = twt.size
        elif depth is not None:
            ds.attrs[AttrKeyField.ns] = depth.size

        ds.attrs.update({name: None for name in AttrKeyField})

    return ds