High-Level Wrapper Function

The diwasp function is the main entry point for analyzing wave data. It provides a simple interface for processing continuous sensor data over multiple time windows and returns wavespectra-compatible output.

Overview

from diwasp import diwasp

result = diwasp(
    data,                    # DataFrame or Dataset
    sensor_mapping,          # Map variable names to sensor types
    window_length,           # Analysis window in seconds
    window_overlap,          # Overlap in seconds
    depth,                   # Water depth in meters
    method='imlm',           # Estimation method
)

Input Data Formats

pandas DataFrame

For DataFrame input:

Index: Must be a DatetimeIndex
Columns: Sensor variables (pressure, velocity, etc.)

import pandas as pd
from diwasp import diwasp

# Load data with datetime index
df = pd.read_csv('wave_data.csv', index_col='time', parse_dates=True)

# Or create from scratch
time = pd.date_range('2024-01-01', periods=7200, freq='500ms')
df = pd.DataFrame({
    'pressure': pressure_data,
    'u_velocity': u_data,
    'v_velocity': v_data,
}, index=time)

result = diwasp(
    df,
    sensor_mapping={'pressure': 'pres', 'u_velocity': 'velx', 'v_velocity': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

xarray Dataset

For Dataset input:

Time dimension: Must contain datetime values
Data variables: Sensor measurements

import xarray as xr
from diwasp import diwasp

ds = xr.Dataset({
    'pres': (['time'], pressure_data),
    'velx': (['time'], u_data),
    'vely': (['time'], v_data),
}, coords={
    'time': time_values,  # datetime64 array
})

result = diwasp(
    ds,
    sensor_mapping={'pres': 'pres', 'velx': 'velx', 'vely': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Parameters

Required Parameters

Parameter	Type	Description
`data`	DataFrame or Dataset	Input sensor data
`sensor_mapping`	dict	Maps variable names to sensor types
`window_length`	float	Analysis window length in seconds
`window_overlap`	float	Overlap between windows in seconds
`depth`	float	Water depth in meters

Sensor Mapping

The sensor_mapping parameter maps your variable/column names to DIWASP sensor types:

sensor_mapping = {
    'my_pressure_col': 'pres',    # Pressure sensor
    'east_velocity': 'velx',       # East velocity component
    'north_velocity': 'vely',      # North velocity component
}

Available sensor types:

Type	Description
`'elev'`	Surface elevation
`'pres'`	Pressure
`'velx'`	X-component velocity
`'vely'`	Y-component velocity
`'velz'`	Z-component velocity
`'vels'`	Surface vertical velocity
`'accs'`	Surface vertical acceleration
`'slpx'`	X-component surface slope
`'slpy'`	Y-component surface slope
`'accx'`	X-component acceleration
`'accy'`	Y-component acceleration
`'accz'`	Z-component acceleration
`'dspx'`	X displacement
`'dspy'`	Y displacement
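
A mistyped sensor type is easy to catch before running the analysis. The sketch below checks a mapping against the type codes in the table above. `check_sensor_mapping` is a hypothetical helper written for this page, not part of the diwasp API.

```python
# Sensor type codes copied from the table above.
SENSOR_TYPES = {
    'elev', 'pres', 'velx', 'vely', 'velz', 'vels', 'accs',
    'slpx', 'slpy', 'accx', 'accy', 'accz', 'dspx', 'dspy',
}


def check_sensor_mapping(sensor_mapping):
    """Return the entries whose sensor type is not a known code.

    Hypothetical helper for illustration; not part of diwasp.
    """
    return {
        name: sensor_type
        for name, sensor_type in sensor_mapping.items()
        if sensor_type not in SENSOR_TYPES
    }


mapping = {
    'my_pressure_col': 'pres',
    'east_velocity': 'velx',
    'north_velocity': 'vel_y',  # typo: should be 'vely'
}
bad_entries = check_sensor_mapping(mapping)
print(bad_entries)
```

An empty result means every value in the mapping is one of the listed codes.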

Optional Parameters

Parameter	Type	Default	Description
`method`	str	`'imlm'`	Estimation method: `'dftm'`, `'emlm'`, `'imlm'`, `'emep'`, `'bdm'`
`time_var`	str	`'time'`	Name of time dimension (Dataset only)
`x_var`	str	`'x'`	Name of x-coordinate variable
`y_var`	str	`'y'`	Name of y-coordinate variable
`z_var`	str	`'z'`	Name of z-coordinate variable
`z`	float or dict	None	Sensor z-positions (height above seabed)
`x`	float or dict	None	Sensor x-positions
`y`	float or dict	None	Sensor y-positions
`fs`	float	None	Sampling frequency (inferred if None)
`freqs`	ndarray	None	Output frequency grid
`dirs`	ndarray	None	Output direction grid
`dres`	int	180	Directional resolution as a number of bins over 360° (180 gives 2° bins)
`nfft`	int	None	FFT length
`smooth`	bool	True	Apply spectral smoothing
`verbose`	int	1	Verbosity level (0, 1, 2)

Sensor Positions

Single Position for All Sensors

If all sensors are co-located:

result = diwasp(
    df,
    sensor_mapping={'p': 'pres', 'u': 'velx', 'v': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    z=0.5,  # All sensors 0.5m above seabed
)

Different Positions per Sensor

For sensors at different locations:

result = diwasp(
    df,
    sensor_mapping={'p': 'pres', 'u': 'velx', 'v': 'vely'},
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    x={'p': 0, 'u': 0, 'v': 0},
    y={'p': 0, 'u': 0, 'v': 0},
    z={'p': 0.5, 'u': 1.0, 'v': 1.0},
)
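
The two calling styles above (one scalar for all sensors, or a dict keyed by variable name) can be thought of as two spellings of the same per-sensor table. The sketch below shows one way to normalize them. `expand_position` is a hypothetical helper, and the fallback of 0.0 for a missing value is an assumption rather than documented diwasp behaviour.

```python
def expand_position(value, sensor_names, default=0.0):
    """Expand a scalar-or-dict position into one value per sensor.

    Hypothetical helper; the default of 0.0 for missing values is an
    assumption, not documented diwasp behaviour.
    """
    if value is None:
        return {name: default for name in sensor_names}
    if isinstance(value, dict):
        return {name: float(value.get(name, default)) for name in sensor_names}
    # A single number applies to every sensor.
    return {name: float(value) for name in sensor_names}


sensors = ['p', 'u', 'v']
z_shared = expand_position(0.5, sensors)
z_per_sensor = expand_position({'p': 0.5, 'u': 1.0, 'v': 1.0}, sensors)
print(z_shared)
print(z_per_sensor)
```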

Positions from Dataset Attributes

For xarray Datasets, positions can be stored as variable attributes:

ds = xr.Dataset({
    'pres': (['time'], data, {'x': 0, 'y': 0, 'z': 0.5}),
    'velx': (['time'], data, {'x': 0, 'y': 0, 'z': 1.0}),
    'vely': (['time'], data, {'x': 0, 'y': 0, 'z': 1.0}),
})

# Positions are read automatically from attributes
result = diwasp(ds, sensor_mapping={...}, ...)

Output Format

The function returns a wavespectra-compatible xarray Dataset:

Dimensions

time: Center time of each analysis window
freq: Frequency bins (Hz)
dir: Direction bins (degrees)

Variables

Variable	Dimensions	Units	Description
`efth`	(time, freq, dir)	m²/Hz/degree	Spectral energy density
`hsig`	(time)	m	Significant wave height
`tp`	(time)	s	Peak period
`fp`	(time)	Hz	Peak frequency
`dp`	(time)	degree	Peak direction
`dm`	(time)	degree	Mean direction
`spread`	(time)	degree	Directional spread
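
The units of `efth` tell you how the bulk statistics relate to it: integrating over frequency (Hz) and direction (degrees) gives the variance m0 in m². The sketch below uses the standard spectral definition Hs = 4·sqrt(m0) on a small synthetic spectrum. How diwasp itself computes `hsig` is not stated on this page, so take this as an illustration of the relationship. The uniform bin widths are an assumption made to keep the integral a plain sum.

```python
import math


def significant_wave_height(efth, df, ddir):
    """Hs = 4 * sqrt(m0), with m0 the integral of efth over freq and dir.

    efth is a nested list indexed [freq][dir] in m^2/Hz/degree; df (Hz)
    and ddir (degrees) are uniform bin widths (an assumption made here
    to keep the integration a simple sum).
    """
    m0 = sum(sum(row) for row in efth) * df * ddir
    return 4.0 * math.sqrt(m0)


n_freq, n_dir = 20, 180
df, ddir = 0.01, 2.0  # Hz, degrees
# Flat spectrum whose total variance m0 is 0.25 m^2.
energy = 0.25 / (n_freq * df * n_dir * ddir)
efth = [[energy] * n_dir for _ in range(n_freq)]

hs = significant_wave_height(efth, df, ddir)
print(f"Hs = {hs:.2f} m")
```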

Example Output Usage

result = diwasp(df, ...)

# Access spectral data
spectrum = result['efth']  # 3D array (time, freq, dir)

# Access statistics time series
hsig = result['hsig'].values  # 1D array
tp = result['tp'].values

# Plot Hsig time series
import matplotlib.pyplot as plt
result['hsig'].plot()
plt.ylabel('Hsig (m)')
plt.show()

# Save to NetCDF
result.to_netcdf('wave_spectra.nc')

Window Configuration

Window Length

The window_length parameter sets the duration of each analysis window in seconds. Longer windows provide better frequency resolution but fewer output time points.

Frequency resolution ≈ 1 / window_length

Typical values:

1800 s (30 min): Standard for ocean waves
600-1200 s: Higher time resolution
3600 s (1 hour): Better frequency resolution
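
To make the trade-off concrete, the rule of thumb above can be evaluated for the typical window lengths. This is only arithmetic on the approximation 1 / window_length; the resolution diwasp actually delivers may also depend on `nfft` and smoothing.

```python
def frequency_resolution(window_length):
    """Approximate frequency resolution in Hz for a window in seconds."""
    return 1.0 / window_length


for window_length in (600, 1800, 3600):
    resolution_mhz = frequency_resolution(window_length) * 1000
    print(f"{window_length:>5} s window -> about {resolution_mhz:.2f} mHz")
```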

Window Overlap

The window_overlap parameter controls how much consecutive windows overlap:

0: No overlap, independent windows
window_length/2: 50% overlap (common choice)
window_length * 0.75: 75% overlap for smoother time series

# 30-minute windows with 50% overlap
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)

Number of Windows

The number of output windows is calculated as:

n_windows = 1 + floor((data_length - window_length) / (window_length - window_overlap))

where data_length, window_length, and window_overlap are all in seconds.
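
The window count can be checked with a small helper before running a long analysis. The sketch counts only complete windows (rounding the quotient down) and lists their center times. How diwasp treats a trailing partial window is not stated here, so treat that part as an assumption. `count_windows` and `window_centers` are hypothetical names.

```python
import math
from datetime import datetime, timedelta


def count_windows(data_length, window_length, window_overlap):
    """Number of complete windows; all arguments are in seconds."""
    if data_length < window_length:
        return 0
    step = window_length - window_overlap
    return 1 + math.floor((data_length - window_length) / step)


def window_centers(start, data_length, window_length, window_overlap):
    """Center time of each complete window, beginning at `start`."""
    step = window_length - window_overlap
    n_windows = count_windows(data_length, window_length, window_overlap)
    return [
        start + timedelta(seconds=i * step + window_length / 2)
        for i in range(n_windows)
    ]


# One hour of data, 30-minute windows, 50% overlap.
n = count_windows(3600, 1800, 900)
centers = window_centers(datetime(2024, 1, 1), 3600, 1800, 900)
print(n)
for center in centers:
    print(center)
```

For this one-hour record the helper reports three windows, centered 15, 30, and 45 minutes after the start.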

Sampling Frequency

The sampling frequency is inferred from the time index. The data is resampled automatically if either of the following holds:

The time index has non-uniform spacing (irregular sampling)
An explicit fs parameter is provided that differs from the inferred frequency
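
To see ahead of time whether a record will trigger resampling, you can inspect the time steps yourself. The sketch below takes the middle value of the sorted time steps as the sampling interval and flags any step that deviates from it. Both the median rule and the tolerance are assumptions of this sketch, not a description of how diwasp infers `fs`.

```python
from datetime import datetime, timedelta


def infer_sampling(times, tolerance=1e-6):
    """Return (fs_hz, is_uniform) for a sorted list of datetimes.

    fs is taken from the middle value of the sorted time steps; spacing
    counts as uniform when every step is within `tolerance` seconds of
    that value. Both choices are assumptions made for this sketch.
    """
    steps = sorted(
        (later - earlier).total_seconds()
        for earlier, later in zip(times, times[1:])
    )
    median_step = steps[len(steps) // 2]
    is_uniform = all(abs(step - median_step) <= tolerance for step in steps)
    return 1.0 / median_step, is_uniform


start = datetime(2024, 1, 1)
regular = [start + timedelta(seconds=0.5 * i) for i in range(10)]
# Delay the second half by 0.1 s to create one irregular step.
irregular = regular[:5] + [t + timedelta(seconds=0.1) for t in regular[5:]]

print(infer_sampling(regular))
print(infer_sampling(irregular))
```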

Automatic Resampling

If resampling is needed, the data is interpolated linearly to a uniform grid:

# Data with irregular sampling will be automatically resampled
result = diwasp(
    df_irregular,  # Non-uniform time spacing
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
)
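
Linear interpolation onto a uniform grid can be sketched in a few lines. This illustrates the idea only. `resample_linear` is a hypothetical function, and the grid origin and end-point handling are assumptions rather than diwasp's actual choices.

```python
from bisect import bisect_right


def resample_linear(t, values, fs):
    """Linearly interpolate samples onto a uniform grid at fs Hz.

    t holds sample times in seconds (strictly increasing); the grid
    starts at t[0] and stops at or before t[-1].
    """
    n_out = int((t[-1] - t[0]) * fs + 1e-9) + 1
    grid = [t[0] + i / fs for i in range(n_out)]
    resampled = []
    for tg in grid:
        # Index of the interval [t[j], t[j + 1]] that contains tg.
        j = min(bisect_right(t, tg) - 1, len(t) - 2)
        weight = (tg - t[j]) / (t[j + 1] - t[j])
        resampled.append(values[j] + weight * (values[j + 1] - values[j]))
    return grid, resampled


# Irregular samples of the line y = 2 * t, resampled to 2 Hz.
t_irregular = [0.0, 0.4, 1.1, 1.5, 2.0]
y_irregular = [2.0 * ti for ti in t_irregular]
grid, y_uniform = resample_linear(t_irregular, y_irregular, fs=2.0)
print(grid)
print(y_uniform)
```

Because the test signal is a straight line, the resampled values fall on the same line. Real wave records will lose some high-frequency content where the gaps are large.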

Override Sampling Frequency

To force resampling to a specific frequency:

result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    fs=2.0,  # Force 2 Hz sampling frequency (triggers resampling if different)
)

Estimation Methods

See Estimation Methods for detailed information on each method.

Quick guide:

'dftm': Fastest, lowest accuracy
'emlm': Fast, good for narrow spectra
'imlm': Good balance (default)
'emep': Robust, handles noise well
'bdm': Best accuracy, slowest

# Use EMEP for noisy data
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=1800,
    window_overlap=900,
    depth=20.0,
    method='emep',
)

Complete Examples

PUV Sensor Analysis

import pandas as pd
from diwasp import diwasp

# Load PUV data
df = pd.read_csv('puv_data.csv', index_col='time', parse_dates=True)

# Analyze with IMLM method
result = diwasp(
    df,
    sensor_mapping={
        'pressure': 'pres',
        'u_velocity': 'velx',
        'v_velocity': 'vely',
    },
    window_length=1800,      # 30 minutes
    window_overlap=900,       # 15 minutes overlap
    depth=15.0,              # 15m water depth
    z=0.5,                   # Sensors 0.5m above seabed
    method='imlm',
    dres=180,                # 2-degree directional resolution
    verbose=1,
)

# Print summary statistics
print(f"Number of spectra: {len(result.time)}")
print(f"Hsig range: {result.hsig.min().values:.2f} - {result.hsig.max().values:.2f} m")
print(f"Tp range: {result.tp.min().values:.1f} - {result.tp.max().values:.1f} s")

# Save results
result.to_netcdf('wave_spectra.nc')

Pressure Gauge Array

import pandas as pd
from diwasp import diwasp

# Load data from 3 pressure gauges
df = pd.read_csv('pressure_array.csv', index_col='time', parse_dates=True)

# Define sensor positions (triangular array)
result = diwasp(
    df,
    sensor_mapping={
        'p1': 'pres',
        'p2': 'pres',
        'p3': 'pres',
    },
    window_length=1800,
    window_overlap=900,
    depth=10.0,
    x={'p1': 0, 'p2': 5, 'p3': -5},
    y={'p1': 0, 'p2': 5, 'p3': 5},
    z=0.0,  # All on seabed
    method='emep',
)

Integration with wavespectra

from diwasp import diwasp
import wavespectra

# Run DIWASP analysis
result = diwasp(df, sensor_mapping=mapping, ...)

# Use wavespectra methods
hs = result.spec.hs()  # Significant wave height
tp = result.spec.tp()  # Peak period

# Plot using wavespectra
result.spec.plot()

Input Validation

The function validates inputs and will raise errors for:

Empty Sensor Mapping

# This will raise ValueError
result = diwasp(df, sensor_mapping={}, ...)

Error: ValueError: sensor_mapping cannot be empty

Invalid Depth

# This will raise ValueError
result = diwasp(df, sensor_mapping=mapping, depth=-10.0, ...)

Error: ValueError: depth must be positive

Invalid Window Overlap

# This will raise ValueError
result = diwasp(
    df,
    sensor_mapping=mapping,
    window_length=300,
    window_overlap=300,  # Must be less than window_length
    depth=20.0,
)

Error: ValueError: window_overlap must be less than window_length
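
The three checks above can be mirrored in your own code if you want to fail fast before loading a large dataset. The sketch reuses the error messages shown in this section. `validate_inputs` is a hypothetical stand-in, and the order of the checks is an assumption.

```python
def validate_inputs(sensor_mapping, window_length, window_overlap, depth):
    """Raise ValueError for the invalid inputs described above.

    Hypothetical stand-in; the check order is an assumption.
    """
    if not sensor_mapping:
        raise ValueError("sensor_mapping cannot be empty")
    if depth <= 0:
        raise ValueError("depth must be positive")
    if window_overlap >= window_length:
        raise ValueError("window_overlap must be less than window_length")


def first_error(**kwargs):
    """Return the validation message, or None when the inputs pass."""
    try:
        validate_inputs(**kwargs)
    except ValueError as exc:
        return str(exc)
    return None


good = dict(sensor_mapping={'p': 'pres'}, window_length=1800,
            window_overlap=900, depth=20.0)
print(first_error(**good))
print(first_error(**{**good, 'depth': -10.0}))
print(first_error(**{**good, 'window_length': 300, 'window_overlap': 300}))
```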

Troubleshooting

"DatetimeIndex" Error

ValueError: DataFrame index must be DatetimeIndex

Solution: Ensure your DataFrame has a datetime index:

df.index = pd.to_datetime(df.index)
# or
df = pd.read_csv('data.csv', index_col='time', parse_dates=True)

"Window exceeds data length" Error

This error means window_length is longer than the record being analyzed. Solution: reduce window_length or collect a longer record:

# Check data length
print(f"Data duration: {(df.index[-1] - df.index[0]).total_seconds()} seconds")

# Use shorter windows
result = diwasp(df, ..., window_length=600, ...)  # 10 minutes

Poor Results

Check sensor positions are correct
Try a more robust method ('emep' or 'bdm')
Verify data quality (no gaps, correct units)
Ensure sufficient sensors for directional resolution

Slow Performance

Use a faster method ('emlm' or 'dftm'; 'imlm' if you are currently on 'emep' or 'bdm')
Reduce directional resolution (dres=90)
Limit frequency range with freqs parameter
Reduce nfft for faster FFT computation
