Working with SEG-Y headers
Headers in SEG-Y data are metadata associated with each trace. SEG-Y does not pool them in a common data block; they are interleaved with the seismic trace data, so extracting them takes some work. segysak has two helper functions for extracting header information from a SEG-Y file: segy_header_scan and segy_header_scrape. Both return pandas.DataFrame objects containing header or header-related information that can be used in QC, analysis and plotting.
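To make the interleaving concrete, the sketch below builds a tiny synthetic SEG-Y-like byte string in memory and reads the inline and crossline numbers back out of each trace header using only the standard library. It assumes the usual SEG-Y layout (a 3200-byte textual header, a 400-byte binary header, then a 240-byte header in front of every trace, with big-endian 4-byte integers at bytes 189 and 193) and 4-byte samples. The helper names are made up for illustration and are not part of segysak.

```python
import struct

TEXT_HEADER = 3200   # bytes of textual file header
BIN_HEADER = 400     # bytes of binary file header
TRACE_HEADER = 240   # bytes of header in front of every trace
SAMPLE_BYTES = 4     # assumed sample size (e.g. 4-byte floats)


def make_trace(iline, xline, n_samples):
    """Build one synthetic trace: a 240-byte header followed by zeroed samples."""
    header = bytearray(TRACE_HEADER)
    # SEG-Y byte locations are 1-based, so byte 189 sits at offset 188.
    struct.pack_into(">i", header, 188, iline)
    struct.pack_into(">i", header, 192, xline)
    return bytes(header) + bytes(n_samples * SAMPLE_BYTES)


def read_headers(data, n_samples, byte_locs):
    """Return one dict per trace with the 4-byte integers at the given byte locations."""
    trace_len = TRACE_HEADER + n_samples * SAMPLE_BYTES
    start = TEXT_HEADER + BIN_HEADER
    rows = []
    # Step from one trace header to the next, skipping the samples in between.
    for pos in range(start, len(data), trace_len):
        rows.append(
            {
                name: struct.unpack_from(">i", data, pos + loc - 1)[0]
                for name, loc in byte_locs.items()
            }
        )
    return rows


n_samples = 850
segy_bytes = bytes(TEXT_HEADER + BIN_HEADER) + b"".join(
    make_trace(10090, xline, n_samples) for xline in (2150, 2151, 2152)
)
headers = read_headers(segy_bytes, n_samples, {"INLINE_3D": 189, "CROSSLINE_3D": 193})
print(headers)
```

Every header value has to be reached by stepping over a full trace of samples, which is why a full scrape of a large file takes time and why scanning only a subset of traces is useful.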
Scanning the headers
segy_header_scan is primarily designed to help quickly ascertain the byte locations of key header information for loading or converting the full SEG-Y file. It does this by looking at only the first N traces (1000 by default) and returns the byte location of each header field along with summary statistics of its values.
from segysak.segy import segy_header_scan
# default just needs the file name
scan = segy_header_scan("data/volve10r12-full-twt-sub3d.sgy")
scan
header | byte_loc | count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|---|
TRACE_SEQUENCE_LINE | 1 | 1000.0 | 100.54 | 57.831072 | 1.0 | 50.75 | 100.5 | 150.25 | 202.0 |
TRACE_SEQUENCE_FILE | 5 | 1000.0 | 10091.98 | 1.407687 | 10090.0 | 10091.00 | 10092.0 | 10093.00 | 10094.0 |
FieldRecord | 9 | 1000.0 | 10091.98 | 1.407687 | 10090.0 | 10091.00 | 10092.0 | 10093.00 | 10094.0 |
TraceNumber | 13 | 1000.0 | 100.54 | 57.831072 | 1.0 | 50.75 | 100.5 | 150.25 | 202.0 |
EnergySourcePoint | 17 | 1000.0 | 0.00 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
SourceMeasurementMantissa | 225 | 1000.0 | 0.00 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
SourceMeasurementExponent | 229 | 1000.0 | 0.00 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
SourceMeasurementUnit | 231 | 1000.0 | 0.00 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
UnassignedInt1 | 233 | 1000.0 | 0.00 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
UnassignedInt2 | 237 | 1000.0 | 0.00 | 0.000000 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 |
91 rows × 9 columns
If you want to see the full DataFrame in a notebook, use the pandas options context manager.
import pandas as pd
from IPython.display import display

# temporarily raise the row limit so all 91 header fields are shown
with pd.option_context("display.max_rows", 91):
    display(scan)
header | byte_loc | count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|---|
TRACE_SEQUENCE_LINE | 1 | 1000.0 | 1.005400e+02 | 57.831072 | 1.0 | 5.075000e+01 | 100.5 | 1.502500e+02 | 202.0 |
TRACE_SEQUENCE_FILE | 5 | 1000.0 | 1.009198e+04 | 1.407687 | 10090.0 | 1.009100e+04 | 10092.0 | 1.009300e+04 | 10094.0 |
FieldRecord | 9 | 1000.0 | 1.009198e+04 | 1.407687 | 10090.0 | 1.009100e+04 | 10092.0 | 1.009300e+04 | 10094.0 |
TraceNumber | 13 | 1000.0 | 1.005400e+02 | 57.831072 | 1.0 | 5.075000e+01 | 100.5 | 1.502500e+02 | 202.0 |
EnergySourcePoint | 17 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
CDP | 21 | 1000.0 | 2.249540e+03 | 57.831072 | 2150.0 | 2.199750e+03 | 2249.5 | 2.299250e+03 | 2351.0 |
CDP_TRACE | 25 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
TraceIdentificationCode | 29 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
NSummedTraces | 31 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
NStackedTraces | 33 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
DataUse | 35 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
offset | 37 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
ReceiverGroupElevation | 41 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceSurfaceElevation | 45 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceDepth | 49 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
ReceiverDatumElevation | 53 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceDatumElevation | 57 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceWaterDepth | 61 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GroupWaterDepth | 65 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
ElevationScalar | 69 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
SourceGroupScalar | 71 | 1000.0 | -1.000000e+02 | 0.000000 | -100.0 | -1.000000e+02 | -100.0 | -1.000000e+02 | -100.0 |
SourceX | 73 | 1000.0 | 4.351992e+07 | 70152.496037 | 43396267.0 | 4.345933e+07 | 43519976.5 | 4.358062e+07 | 43641261.0 |
SourceY | 77 | 1000.0 | 6.477772e+08 | 17532.885301 | 647744704.0 | 6.477622e+08 | 647777222.0 | 6.477923e+08 | 647809133.0 |
GroupX | 81 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GroupY | 85 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
CoordinateUnits | 89 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
WeatheringVelocity | 91 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SubWeatheringVelocity | 93 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceUpholeTime | 95 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GroupUpholeTime | 97 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceStaticCorrection | 99 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GroupStaticCorrection | 101 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TotalStaticApplied | 103 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
LagTimeA | 105 | 1000.0 | 4.000000e+00 | 0.000000 | 4.0 | 4.000000e+00 | 4.0 | 4.000000e+00 | 4.0 |
LagTimeB | 107 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
DelayRecordingTime | 109 | 1000.0 | 4.000000e+00 | 0.000000 | 4.0 | 4.000000e+00 | 4.0 | 4.000000e+00 | 4.0 |
MuteTimeStart | 111 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
MuteTimeEND | 113 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TRACE_SAMPLE_COUNT | 115 | 1000.0 | 8.500000e+02 | 0.000000 | 850.0 | 8.500000e+02 | 850.0 | 8.500000e+02 | 850.0 |
TRACE_SAMPLE_INTERVAL | 117 | 1000.0 | 4.000000e+03 | 0.000000 | 4000.0 | 4.000000e+03 | 4000.0 | 4.000000e+03 | 4000.0 |
GainType | 119 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
InstrumentGainConstant | 121 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
InstrumentInitialGain | 123 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
Correlated | 125 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
SweepFrequencyStart | 127 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SweepFrequencyEnd | 129 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SweepLength | 131 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SweepType | 133 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
SweepTraceTaperLengthStart | 135 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SweepTraceTaperLengthEnd | 137 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TaperType | 139 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
AliasFilterFrequency | 141 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
AliasFilterSlope | 143 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
NotchFilterFrequency | 145 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
NotchFilterSlope | 147 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
LowCutFrequency | 149 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
HighCutFrequency | 151 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
LowCutSlope | 153 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
HighCutSlope | 155 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
YearDataRecorded | 157 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
DayOfYear | 159 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
HourOfDay | 161 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
MinuteOfHour | 163 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SecondOfMinute | 165 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TimeBaseCode | 167 | 1000.0 | 1.000000e+00 | 0.000000 | 1.0 | 1.000000e+00 | 1.0 | 1.000000e+00 | 1.0 |
TraceWeightingFactor | 169 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GeophoneGroupNumberRoll1 | 171 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GeophoneGroupNumberFirstTraceOrigField | 173 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GeophoneGroupNumberLastTraceOrigField | 175 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
GapSize | 177 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
OverTravel | 179 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
CDP_X | 181 | 1000.0 | 4.351992e+07 | 70152.496037 | 43396267.0 | 4.345933e+07 | 43519976.5 | 4.358062e+07 | 43641261.0 |
CDP_Y | 185 | 1000.0 | 6.477772e+08 | 17532.885301 | 647744704.0 | 6.477622e+08 | 647777222.0 | 6.477923e+08 | 647809133.0 |
INLINE_3D | 189 | 1000.0 | 1.009198e+04 | 1.407687 | 10090.0 | 1.009100e+04 | 10092.0 | 1.009300e+04 | 10094.0 |
CROSSLINE_3D | 193 | 1000.0 | 2.249540e+03 | 57.831072 | 2150.0 | 2.199750e+03 | 2249.5 | 2.299250e+03 | 2351.0 |
ShotPoint | 197 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
ShotPointScalar | 201 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TraceValueMeasurementUnit | 203 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TransductionConstantMantissa | 205 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TransductionConstantPower | 209 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TransductionUnit | 211 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
TraceIdentifier | 213 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
ScalarTraceHeader | 215 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceType | 217 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceEnergyDirectionMantissa | 219 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceEnergyDirectionExponent | 223 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceMeasurementMantissa | 225 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceMeasurementExponent | 229 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
SourceMeasurementUnit | 231 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
UnassignedInt1 | 233 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
UnassignedInt2 | 237 | 1000.0 | 0.000000e+00 | 0.000000 | 0.0 | 0.000000e+00 | 0.0 | 0.000000e+00 | 0.0 |
Often many header fields are not filled, so let's filter on the standard deviation column, std. Note that this also drops fields that hold a constant non-zero value, such as SourceGroupScalar. In fact, so few fields remain that we don't need the context manager. As you can see, for segy_loader or segy_converter we will need to pass 189 and 193 as the byte locations for iline and xline respectively. The byte locations for cdp_x and cdp_y are either 73 and 77 or 181 and 185; the two pairs hold identical values.
# keep only the header fields whose values vary across the scanned traces
scan[scan["std"] > 0]
header | byte_loc | count | mean | std | min | 25% | 50% | 75% | max |
---|---|---|---|---|---|---|---|---|---|
TRACE_SEQUENCE_LINE | 1 | 1000.0 | 1.005400e+02 | 57.831072 | 1.0 | 5.075000e+01 | 100.5 | 1.502500e+02 | 202.0 |
TRACE_SEQUENCE_FILE | 5 | 1000.0 | 1.009198e+04 | 1.407687 | 10090.0 | 1.009100e+04 | 10092.0 | 1.009300e+04 | 10094.0 |
FieldRecord | 9 | 1000.0 | 1.009198e+04 | 1.407687 | 10090.0 | 1.009100e+04 | 10092.0 | 1.009300e+04 | 10094.0 |
TraceNumber | 13 | 1000.0 | 1.005400e+02 | 57.831072 | 1.0 | 5.075000e+01 | 100.5 | 1.502500e+02 | 202.0 |
CDP | 21 | 1000.0 | 2.249540e+03 | 57.831072 | 2150.0 | 2.199750e+03 | 2249.5 | 2.299250e+03 | 2351.0 |
SourceX | 73 | 1000.0 | 4.351992e+07 | 70152.496037 | 43396267.0 | 4.345933e+07 | 43519976.5 | 4.358062e+07 | 43641261.0 |
SourceY | 77 | 1000.0 | 6.477772e+08 | 17532.885301 | 647744704.0 | 6.477622e+08 | 647777222.0 | 6.477923e+08 | 647809133.0 |
CDP_X | 181 | 1000.0 | 4.351992e+07 | 70152.496037 | 43396267.0 | 4.345933e+07 | 43519976.5 | 4.358062e+07 | 43641261.0 |
CDP_Y | 185 | 1000.0 | 6.477772e+08 | 17532.885301 | 647744704.0 | 6.477622e+08 | 647777222.0 | 6.477923e+08 | 647809133.0 |
INLINE_3D | 189 | 1000.0 | 1.009198e+04 | 1.407687 | 10090.0 | 1.009100e+04 | 10092.0 | 1.009300e+04 | 10094.0 |
CROSSLINE_3D | 193 | 1000.0 | 2.249540e+03 | 57.831072 | 2150.0 | 2.199750e+03 | 2249.5 | 2.299250e+03 | 2351.0 |
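The byte locations picked out above can be gathered into keyword arguments for a loader call. This is a minimal sketch, assuming the loader accepts the keywords iline, xline, cdp_x and cdp_y as named in the text. The dictionary is hand-copied from the byte_loc column of the filtered scan, and the field-to-keyword mapping is a hypothetical helper, not something segysak provides.

```python
# byte_loc values copied from the filtered scan output above
byte_loc = {
    "INLINE_3D": 189,
    "CROSSLINE_3D": 193,
    "CDP_X": 181,
    "CDP_Y": 185,
}

# Hypothetical mapping from header field names to loader keyword names.
KEYWORD_FOR_FIELD = {
    "INLINE_3D": "iline",
    "CROSSLINE_3D": "xline",
    "CDP_X": "cdp_x",
    "CDP_Y": "cdp_y",
}

loader_kwargs = {KEYWORD_FOR_FIELD[field]: loc for field, loc in byte_loc.items()}
print(loader_kwargs)
# These could then be passed on, e.g. segy_loader(path, **loader_kwargs),
# assuming the loader accepts these keyword names.
```

With the real scan DataFrame, the same numbers could be read with scan.loc["INLINE_3D", "byte_loc"] rather than copied by hand.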
Scraping the headers
Scraping the headers works like a scan, but instead of statistics we get a DataFrame of the actual trace header values. You can limit the scrape to the first N traces with the partial_scan keyword if required. The index of the DataFrame is the trace index and the columns are the header fields.
from segysak.segy import segy_header_scrape
scrape = segy_header_scrape("data/volve10r12-full-twt-sub3d.sgy", partial_scan=10000)
scrape
trace | TRACE_SEQUENCE_LINE | TRACE_SEQUENCE_FILE | FieldRecord | TraceNumber | EnergySourcePoint | CDP | CDP_TRACE | TraceIdentificationCode | NSummedTraces | NStackedTraces | ... | TraceIdentifier | ScalarTraceHeader | SourceType | SourceEnergyDirectionMantissa | SourceEnergyDirectionExponent | SourceMeasurementMantissa | SourceMeasurementExponent | SourceMeasurementUnit | UnassignedInt1 | UnassignedInt2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 10090 | 10090 | 1 | 0 | 2150 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 2 | 10090 | 10090 | 2 | 0 | 2151 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 3 | 10090 | 10090 | 3 | 0 | 2152 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
3 | 4 | 10090 | 10090 | 4 | 0 | 2153 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
4 | 5 | 10090 | 10090 | 5 | 0 | 2154 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9995 | 98 | 10139 | 10139 | 98 | 0 | 2247 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
9996 | 99 | 10139 | 10139 | 99 | 0 | 2248 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
9997 | 100 | 10139 | 10139 | 100 | 0 | 2249 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
9998 | 101 | 10139 | 10139 | 101 | 0 | 2250 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
9999 | 102 | 10139 | 10139 | 102 | 0 | 2251 | 1 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
10000 rows × 91 columns
We know from the scan that many of these fields are empty, so let's filter the scrape using the standard deviation again, passing the index of the filtered scan, whose labels match the scrape's column names.
scrape = scrape[scan[scan["std"] > 0].index]
scrape
trace | TRACE_SEQUENCE_LINE | TRACE_SEQUENCE_FILE | FieldRecord | TraceNumber | CDP | SourceX | SourceY | CDP_X | CDP_Y | INLINE_3D | CROSSLINE_3D |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 10090 | 10090 | 1 | 2150 | 43640052 | 647744704 | 43640052 | 647744704 | 10090 | 2150 |
1 | 2 | 10090 | 10090 | 2 | 2151 | 43638839 | 647745006 | 43638839 | 647745006 | 10090 | 2151 |
2 | 3 | 10090 | 10090 | 3 | 2152 | 43637626 | 647745309 | 43637626 | 647745309 | 10090 | 2152 |
3 | 4 | 10090 | 10090 | 4 | 2153 | 43636413 | 647745611 | 43636413 | 647745611 | 10090 | 2153 |
4 | 5 | 10090 | 10090 | 5 | 2154 | 43635200 | 647745914 | 43635200 | 647745914 | 10090 | 2154 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9995 | 98 | 10139 | 10139 | 98 | 2247 | 43537224 | 647833471 | 43537224 | 647833471 | 10139 | 2247 |
9996 | 99 | 10139 | 10139 | 99 | 2248 | 43536011 | 647833773 | 43536011 | 647833773 | 10139 | 2248 |
9997 | 100 | 10139 | 10139 | 100 | 2249 | 43534798 | 647834076 | 43534798 | 647834076 | 10139 | 2249 |
9998 | 101 | 10139 | 10139 | 101 | 2250 | 43533585 | 647834378 | 43533585 | 647834378 | 10139 | 2250 |
9999 | 102 | 10139 | 10139 | 102 | 2251 | 43532372 | 647834681 | 43532372 | 647834681 | 10139 | 2251 |
10000 rows × 11 columns
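The coordinates in the scrape are raw integers. The scan reports SourceGroupScalar (byte 71) as -100 for every scanned trace, and under the SEG-Y convention a negative scalar is a divisor while a positive one is a multiplier. The sketch below applies that convention to the first two rows of the table above. The function name is illustrative, and treating a scalar of 0 as 1 is an assumption rather than something this file demonstrates.

```python
def apply_coordinate_scalar(value, scalar):
    """Scale a raw header coordinate: negative scalar divides, positive multiplies."""
    if scalar == 0:
        return float(value)  # assumption: treat an unset scalar as 1
    if scalar < 0:
        return value / abs(scalar)
    return float(value * scalar)


# (CDP_X, CDP_Y) for traces 0 and 1, copied from the scrape table above
raw = [(43640052, 647744704), (43638839, 647745006)]
scaled = [
    (apply_coordinate_scalar(x, -100), apply_coordinate_scalar(y, -100))
    for x, y in raw
]
print(scaled)
```

Keeping the raw integers in the scrape is fine for QC plots; the scaling only matters once the values are compared with coordinates from other sources.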
With the empty fields removed, a histogram of each remaining header field gives a quick view of how its values are distributed.
import matplotlib.pyplot as plt
%matplotlib inline
plot = scrape.hist(bins=25, figsize=(20, 10))
We can also plot the geometry to check that everything looks OK. Here the line numbering and coordinates match up, great!
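fig, axs = plt.subplots(nrows=2, figsize=(12, 10), sharex=True, sharey=True)
# colour the trace positions by inline number
scrape.plot(
    kind="scatter", x="CDP_X", y="CDP_Y", c="INLINE_3D", ax=axs[0], cmap="gist_ncar"
)
# colour the trace positions by crossline number
scrape.plot(
    kind="scatter", x="CDP_X", y="CDP_Y", c="CROSSLINE_3D", ax=axs[1], cmap="gist_ncar"
)
# use equal axis scaling so the survey geometry is not distorted
for aa in axs:
    aa.set_aspect("equal", "box")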
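A numeric check can complement the plots. The sketch below estimates the distance between neighbouring crosslines on inline 10090 from the first five rows of the scrape table. It assumes the raw coordinates should be divided by 100, following the SourceGroupScalar of -100 reported by the scan; the variable names are illustrative.

```python
import math

# (CROSSLINE_3D, CDP_X, CDP_Y) for traces 0-4 on inline 10090, from the scrape table
rows = [
    (2150, 43640052, 647744704),
    (2151, 43638839, 647745006),
    (2152, 43637626, 647745309),
    (2153, 43636413, 647745611),
    (2154, 43635200, 647745914),
]
SCALAR_DIVISOR = 100  # assumption: SourceGroupScalar of -100 means divide by 100

# distance between each pair of consecutive traces, in scaled coordinate units
spacings = []
for (_, x0, y0), (_, x1, y1) in zip(rows, rows[1:]):
    spacings.append(math.hypot(x1 - x0, y1 - y0) / SCALAR_DIVISOR)

print([round(s, 2) for s in spacings])
```

A near-constant step of about 12.5 coordinate units between consecutive crosslines is consistent with the regular grid seen in the scatter plots.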