Skip to content

Latest commit

 

History

History
499 lines (381 loc) · 13.5 KB

File metadata and controls

499 lines (381 loc) · 13.5 KB

High-Level Wrapper Function

The diwasp function is the main entry point for analyzing wave data. It provides a simple interface for processing continuous sensor data over multiple time windows and returns wavespectra-compatible output.

Overview

from diwasp import diwasp

result = diwasp(
    data,                    # DataFrame or Dataset
    sensor_mapping,          # Map variable names to sensor types
    window_length,           # Analysis window in seconds
    window_overlap,          # Overlap in seconds
    depth,                   # Water depth in meters
    method='imlm',           # Estimation method
)

Input Data Formats

pandas DataFrame

For DataFrame input:

  • Index: Must be a DatetimeIndex
  • Columns: Sensor variables (pressure, velocity, etc.)
import pandas as pd
from diwasp import diwasp

# Load data with datetime index
df = pd.read_csv('wave_data.csv', index_col='time', parse_dates=True)

# Or create from scratch
time = pd.date_range('2024-01-01', periods=7200, freq='500ms')
df = pd.DataFrame({
    'pressure': pressure_data,
    'u_velocity': u_data,
    'v_velocity': v_data,
}, index=time)

result = diwasp(
    df,
    sensor_mapping={'pressure': 'pres', 'u_velocity': 'velx', 'v_velocity': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

xarray Dataset

For Dataset input:

  • Time dimension: Must contain datetime values
  • Data variables: Sensor measurements
import xarray as xr
from diwasp import diwasp

ds = xr.Dataset({
    'pres': (['time'], pressure_data),
    'velx': (['time'], u_data),
    'vely': (['time'], v_data),
}, coords={
    'time': time_values,  # datetime64 array
})

result = diwasp(
    ds,
    sensor_mapping={'pres': 'pres', 'velx': 'velx', 'vely': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Parameters

Required Parameters

Parameter Type Description
data DataFrame or Dataset Input sensor data
sensor_mapping dict Maps variable names to sensor types
window_length float Analysis window length in seconds
window_overlap float Overlap between windows in seconds
depth float Water depth in meters

Sensor Mapping

The sensor_mapping parameter maps your variable/column names to DIWASP sensor types:

sensor_mapping = {
    'my_pressure_col': 'pres',    # Pressure sensor
    'east_velocity': 'velx',       # East velocity component
    'north_velocity': 'vely',      # North velocity component
}

Available sensor types:

Type Description
'elev' Surface elevation
'pres' Pressure
'velx' X-component velocity
'vely' Y-component velocity
'velz' Z-component velocity
'vels' Surface vertical velocity
'accs' Surface vertical acceleration
'slpx' X-component surface slope
'slpy' Y-component surface slope
'accx' X-component acceleration
'accy' Y-component acceleration
'accz' Z-component acceleration
'dspx' X displacement
'dspy' Y displacement

Optional Parameters

Parameter Type Default Description
method str 'imlm' Estimation method: 'dftm', 'emlm', 'imlm', 'emep', 'bdm'
time_var str 'time' Name of time dimension (Dataset only)
x_var str 'x' Name of x-coordinate variable
y_var str 'y' Name of y-coordinate variable
z_var str 'z' Name of z-coordinate variable
z float or dict None Sensor z-positions (height above seabed)
x float or dict None Sensor x-positions
y float or dict None Sensor y-positions
fs float None Sampling frequency (inferred if None)
freqs ndarray None Output frequency grid
dirs ndarray None Output direction grid
dres int 180 Directional resolution
nfft int None FFT length
smooth bool True Apply spectral smoothing
verbose int 1 Verbosity level (0, 1, 2)

Sensor Positions

Single Position for All Sensors

If all sensors are co-located:

result = diwasp(
    df,
    sensor_mapping={'p': 'pres', 'u': 'velx', 'v': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    z=0.5,  # All sensors 0.5m above seabed
)

Different Positions per Sensor

For sensors at different locations:

result = diwasp(
    df,
    sensor_mapping={'p': 'pres', 'u': 'velx', 'v': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    x={'p': 0, 'u': 0, 'v': 0},
    y={'p': 0, 'u': 0, 'v': 0},
    z={'p': 0.5, 'u': 1.0, 'v': 1.0},
)

Positions from Dataset Attributes

For xarray Datasets, positions can be stored as variable attributes:

ds = xr.Dataset({
    'pres': (['time'], data, {'x': 0, 'y': 0, 'z': 0.5}),
    'velx': (['time'], data, {'x': 0, 'y': 0, 'z': 1.0}),
    'vely': (['time'], data, {'x': 0, 'y': 0, 'z': 1.0}),
})

# Positions are read automatically from attributes
result = diwasp(ds, sensor_mapping={...}, ...)

Output Format

The function returns a wavespectra-compatible xarray Dataset:

Dimensions

  • time: Center time of each analysis window
  • freq: Frequency bins (Hz)
  • dir: Direction bins (degrees)

Variables

Variable Dimensions Units Description
efth (time, freq, dir) m²/Hz/degree Spectral energy density
hsig (time) m Significant wave height
tp (time) s Peak period
fp (time) Hz Peak frequency
dp (time) degree Peak direction
dm (time) degree Mean direction
spread (time) degree Directional spread

Example Output Usage

result = diwasp(df, ...)

# Access spectral data
spectrum = result['efth']  # 3D array (time, freq, dir)

# Access statistics time series
hsig = result['hsig'].values  # 1D array
tp = result['tp'].values

# Plot Hsig time series
import matplotlib.pyplot as plt
result['hsig'].plot()
plt.ylabel('Hsig (m)')
plt.show()

# Save to NetCDF
result.to_netcdf('wave_spectra.nc')

Window Configuration

Window Length

The window_length parameter sets the duration of each analysis window in seconds. Longer windows provide better frequency resolution but fewer output time points.

Frequency resolution ≈ 1 / window_length

Typical values:

  • 1800 s (30 min): Standard for ocean waves
  • 600-1200 s: Higher time resolution
  • 3600 s (1 hour): Better frequency resolution

Window Overlap

The window_overlap parameter controls how much consecutive windows overlap:

  • 0: No overlap, independent windows
  • window_length/2: 50% overlap (common choice)
  • window_length * 0.75: 75% overlap for smoother time series
# 30-minute windows with 50% overlap
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Number of Windows

The number of output windows is calculated as:

n_windows = 1 + (data_length - window_length) / (window_length - window_overlap)

Sampling Frequency

The sampling frequency is automatically inferred from the time index. The function will automatically resample data if:

  1. The time index has non-uniform spacing (irregular sampling)
  2. An explicit fs parameter is provided that differs from the inferred frequency

Automatic Resampling

If resampling is needed, the data is interpolated linearly to a uniform grid:

# Data with irregular sampling will be automatically resampled
result = diwasp(
    df_irregular,  # Non-uniform time spacing
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Override Sampling Frequency

To force resampling to a specific frequency:

result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    fs=2.0,  # Force 2 Hz sampling frequency (triggers resampling if different)
)

Estimation Methods

See Estimation Methods for detailed information on each method.

Quick guide:

  • 'dftm': Fastest, lowest accuracy
  • 'emlm': Fast, good for narrow spectra
  • 'imlm': Good balance (default)
  • 'emep': Robust, handles noise well
  • 'bdm': Best accuracy, slowest
# Use EMEP for noisy data
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    method='emep',
)

Complete Examples

PUV Sensor Analysis

import pandas as pd
from diwasp import diwasp

# Load PUV data
df = pd.read_csv('puv_data.csv', index_col='time', parse_dates=True)

# Analyze with IMLM method
result = diwasp(
    df,
    sensor_mapping={
        'pressure': 'pres',
        'u_velocity': 'velx',
        'v_velocity': 'vely',
    },
    window_length=1800,      # 30 minutes
    window_overlap=900,       # 15 minutes overlap
    depth=15.0,              # 15m water depth
    z=0.5,                   # Sensors 0.5m above seabed
    method='imlm',
    dres=180,                # 2-degree directional resolution
    verbose=1,
)

# Print summary statistics
print(f"Number of spectra: {len(result.time)}")
print(f"Hsig range: {result.hsig.min().values:.2f} - {result.hsig.max().values:.2f} m")
print(f"Tp range: {result.tp.min().values:.1f} - {result.tp.max().values:.1f} s")

# Save results
result.to_netcdf('wave_spectra.nc')

Pressure Gauge Array

import pandas as pd
from diwasp import diwasp

# Load data from 3 pressure gauges
df = pd.read_csv('pressure_array.csv', index_col='time', parse_dates=True)

# Define sensor positions (triangular array)
result = diwasp(
    df,
    sensor_mapping={
        'p1': 'pres',
        'p2': 'pres',
        'p3': 'pres',
    },
    window_length=1800,
    window_overlap=900,
    depth=10.0,
    x={'p1': 0, 'p2': 5, 'p3': -5},
    y={'p1': 0, 'p2': 5, 'p3': 5},
    z=0.0,  # All on seabed
    method='emep',
)

Integration with wavespectra

from diwasp import diwasp
import wavespectra

# Run DIWASP analysis
result = diwasp(df, sensor_mapping=mapping, ...)

# Use wavespectra methods
hs = result.spec.hs()  # Significant wave height
tp = result.spec.tp()  # Peak period

# Plot using wavespectra
result.spec.plot()

Input Validation

The function validates inputs and will raise errors for:

Empty Sensor Mapping

# This will raise ValueError
result = diwasp(df, sensor_mapping={}, ...)

Error: ValueError: sensor_mapping cannot be empty

Invalid Depth

# This will raise ValueError
result = diwasp(df, sensor_mapping=mapping, depth=-10.0, ...)

Error: ValueError: depth must be positive

Invalid Window Overlap

# This will raise ValueError
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=300,
    window_overlap=300,  # Must be less than window_length
    depth=20.0,
)

Error: ValueError: window_overlap must be less than window_length

Troubleshooting

"DatetimeIndex" Error

ValueError: DataFrame index must be DatetimeIndex

Solution: Ensure your DataFrame has a datetime index:

df.index = pd.to_datetime(df.index)
# or
df = pd.read_csv('data.csv', index_col='time', parse_dates=True)

"Window exceeds data length" Error

Solution: Reduce window length or get more data:

# Check data length
print(f"Data duration: {(df.index[-1] - df.index[0]).total_seconds()} seconds")

# Use shorter windows
result = diwasp(df, ..., window_length=600, ...)  # 10 minutes

Poor Results

  1. Check sensor positions are correct
  2. Try a more robust method ('emep' or 'bdm')
  3. Verify data quality (no gaps, correct units)
  4. Ensure sufficient sensors for directional resolution

Slow Performance

  1. Use faster method ('imlm' or 'emlm')
  2. Reduce directional resolution (dres=90)
  3. Limit frequency range with freqs parameter
  4. Reduce nfft for faster FFT computation