Quick Overview

This is UTDQuake in a nutshell: the main classes and functions for working with the UTDQuake dataset.

For more specialized functionality, explore the Core and Utils subpackages.

UTDQuake Python Package

Provides convenient access to the UTDQuake seismic dataset. The package includes:

  • Dataset: Class for global dataset access (all networks, stations, events).

  • Network: Class for network-specific access, including EventBank, picks, and stations.

  • download_snapshot: Function to download UTDQuake data from Hugging Face.

  • load: Function to load datasets from Hugging Face by key.

Usage

>>> from utdquake import Dataset
>>> ds = Dataset()
>>> ds.stations.head()
>>> net = ds.get_network("tx")
>>> net.events.head()
class utdquake.Dataset(das=False)[source]

Bases: object

High-level interface for the UTDQuake dataset.

Provides access to networks, stations, events, and picks. Allows plotting and summary analysis of the dataset.

Parameters:

das (bool, optional) – If True, use the DAS cache root environment variable (UTDQUAKE_DAS_ROOT). Otherwise use the standard cache root (UTDQUAKE_ROOT). Default is False.
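As a rough sketch of the behaviour described above, the cache root could be resolved as follows. Only the two environment variable names come from this page; the helper name resolve_cache_root and the fallback directory are assumptions made for illustration.

```python
import os
from pathlib import Path

# Hypothetical fallback; the real default location is not documented here.
ASSUMED_DEFAULT = Path.home() / ".cache" / "utdquake"


def resolve_cache_root(das=False):
    """Pick the cache root from the environment, mirroring the `das` flag."""
    var = "UTDQUAKE_DAS_ROOT" if das else "UTDQUAKE_ROOT"
    value = os.environ.get(var)
    return Path(value) if value else ASSUMED_DEFAULT


# Example values only; point these at your own directories.
os.environ["UTDQUAKE_ROOT"] = "/data/utdquake"
os.environ["UTDQUAKE_DAS_ROOT"] = "/data/utdquake_das"
print(resolve_cache_root())          # standard cache root
print(resolve_cache_root(das=True))  # DAS cache root
```

Setting the variables before creating a Dataset is presumably what matters in practice, although the exact moment they are read is not stated here.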

compute_stats(networks=None, use_cache=True, distance_bins=None, merge=False)[source]

Compute statistics for multiple networks and cache results.

Parameters:
  • networks (list of str, optional) – List of network names. If None or ‘*’, use all local networks.

  • use_cache (bool) – Whether to use cached stats if available.

  • distance_bins (list, optional) – Distance bins for epicentral distance histograms.

  • merge (bool) – If True, merge stats across networks for combined analysis. Default: False.

Returns:

all_stats – Depends on merge:

  • merge=False: {network_name: stats_dict}

  • merge=True: a single merged stats_dict suitable for plotting

Return type:

dict
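The two return shapes can be pictured with plain dictionaries. This is a minimal sketch, assuming that merging simply concatenates list-valued entries; how the package actually merges stats is not documented on this page, and merge_stats is a hypothetical helper. The keys depth_values and magnitude_values are taken from Network.compute_stats.

```python
def merge_stats(per_network):
    """Concatenate list-valued entries from several per-network stats dicts."""
    merged = {}
    for stats in per_network.values():
        for key, values in stats.items():
            merged.setdefault(key, []).extend(values)
    return merged


# Shape returned with merge=False: {network_name: stats_dict}
per_network = {
    "tx": {"depth_values": [5.0, 7.5], "magnitude_values": [2.1, 3.0]},
    "uw": {"depth_values": [12.0], "magnitude_values": [1.8]},
}

# Shape returned with merge=True: one stats_dict covering all networks
merged = merge_stats(per_network)
print(merged["depth_values"])
```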

property description: str

Return a summary of the dataset.

Returns:

Summary of networks, stations, and events.

Return type:

str

download(networks, include_networks=True, include_events=False, include_stations=False, include_picks=False, include_banks=False, include_travel_time=False, unzip_banks=True, overwrite=False)[source]

Download selected data from the UTDQuake Hugging Face repository.

Parameters:
  • networks (str or list of str) –

    Networks to download:

    • “*” downloads all networks

    • “t*” downloads all networks starting with ‘t’

    • [“tx”, “uw”] downloads only the specified networks

  • include_networks (bool, optional) – Whether to download the network metadata. Default: True.

  • include_events (bool, optional) – Whether to download the events data. Default: False.

  • include_stations (bool, optional) – Whether to download station metadata. Default: False.

  • include_picks (bool, optional) – Whether to download seismic picks. Default: False.

  • include_banks (bool, optional) – Whether to download the bank (synthetic) data. Default: False.

  • include_travel_time (bool, optional) – Whether to download the travel time models. Default: False.

  • overwrite (bool, optional) – If True, existing files will be re-downloaded. Default: False.

  • unzip_banks (bool, optional) – If True, downloaded bank zip files will be automatically extracted. Default: True.

Returns:

Path to the local directory containing the downloaded snapshot.

Return type:

Path

Notes

The function builds a set of allowed file patterns based on the requested networks and data types. Only files matching these patterns are downloaded. Zip files in banks are optionally unzipped.
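The wildcard forms accepted by networks behave like shell-style patterns. The sketch below uses fnmatch from the standard library to illustrate the selection; whether the package uses fnmatch internally is an assumption, and select_networks and the list of available names are made up for the example.

```python
from fnmatch import fnmatch


def select_networks(requested, available):
    """Return the available network names matching the requested pattern(s)."""
    patterns = [requested] if isinstance(requested, str) else list(requested)
    return [name for name in available
            if any(fnmatch(name, pattern) for pattern in patterns)]


available = ["tx", "tu", "uw", "ak"]  # made-up list of network names
print(select_networks("*", available))
print(select_networks("t*", available))
print(select_networks(["tx", "uw"], available))
```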

Examples

>>> ds = Dataset()
>>> ds.download(networks=["tx", "uw"], include_picks=True, overwrite=False)
property events

Return all events as a Pandas DataFrame.

get_events(network='*', streaming=False, **kwargs)[source]

Return events for a specific network.

get_network(name)[source]

Return a Network object for a given network name.

get_picks(network='*', streaming=True)[source]

Return picks for a specific network.

get_stations(network='*', streaming=False, **kwargs)[source]

Return stations for a specific network.

property local_networks

Return all local networks as a Pandas DataFrame.

property networks

Return all networks as a Pandas DataFrame.

plot_network_station_density(density_in_region=True, savepath=None, show=True)[source]

Plot station density maps for all networks.

Parameters:

  • density_in_region (bool, optional) – If True, calculate station density only within the network’s defined region. Default is True.

  • savepath (str or None) – Path to save the figure. If None, the figure is not saved.

  • show (bool) – Whether to display the figure.

plot_overview(consider_calculated_stations=True, savepath=None, show=True)[source]

Plot a comprehensive overview of UTDQuake dataset.

Includes events, stations, and summary analysis.

Parameters:
  • consider_calculated_stations (bool, optional) – If True, also plot calculated stations (if available). Defaults to True.

  • savepath (str or None) – Path to save the figure. If None, figure is not saved.

  • show (bool) – Whether to display the figure.

plot_phase_count_radar_by_magnitude(savepath=None, show=True)[source]

Create radar plots of phase and station counts binned by magnitude.

For each magnitude range, the function displays the mean values of P phases, S phases, used phases, and station counts in a radar chart. Variability is represented using interquartile range (IQR) and the 10–90 percentile envelope.

Parameters:

  • savepath (str or None) – Path to save the figure. If None, the figure is not saved.

  • show (bool) – Whether to display the figure.
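The summary values drawn on each radar axis (mean, IQR, and 10–90 percentile envelope) can be reproduced with the standard library. This is a minimal sketch on made-up counts; the interpolation method and the magnitude bin labels are assumptions, and summarize is a hypothetical helper rather than part of the package.

```python
from statistics import mean, quantiles


def summarize(values):
    """Mean, IQR bounds and 10-90 percentile bounds of a list of counts."""
    deciles = quantiles(values, n=10, method="inclusive")   # 9 cut points
    quartiles = quantiles(values, n=4, method="inclusive")  # 3 cut points
    return {
        "mean": mean(values),
        "iqr": (quartiles[0], quartiles[2]),
        "p10_p90": (deciles[0], deciles[8]),
    }


# Made-up P-phase counts for events grouped into two magnitude ranges
p_counts = {"M1-2": [4, 6, 5, 8, 7], "M2-3": [10, 14, 12, 20, 16]}
for mag_range, counts in p_counts.items():
    print(mag_range, summarize(counts))
```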

plot_stats(savepath=None, show=True)[source]

Plot a 5-panel figure using precomputed network stats. See Dataset.compute_stats(merge=True) for the expected stats structure.

Panels: depth, magnitude, epicentral distance, azimuthal gap, azimuth distribution.

Parameters:
  • savepath (str, optional) – Path to save the figure. Default: None.

  • show (bool, optional) – If True, display the figure. Default: True.
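For readers unfamiliar with the azimuthal gap panel: the azimuthal gap of an event is the largest angle between adjacent stations as seen from the epicenter. The sketch below computes it from a list of station azimuths; it illustrates the standard definition only, and azimuthal_gap is a hypothetical helper, not necessarily how the package fills az_gap_hist.

```python
def azimuthal_gap(azimuths_deg):
    """Largest gap (degrees) between adjacent station azimuths around an event."""
    if len(azimuths_deg) < 2:
        return 360.0
    az = sorted(a % 360.0 for a in azimuths_deg)
    # Differences between neighbours, plus the wrap-around gap across north
    gaps = [b - a for a, b in zip(az, az[1:])]
    gaps.append(360.0 - az[-1] + az[0])
    return max(gaps)


print(azimuthal_gap([10.0, 95.0, 200.0, 300.0]))
```

A small gap means the event is well surrounded by stations; a gap above 180 degrees means all stations lie on one side.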

plot_travel_time(networks=None, zscore_threshold=3, savepath=None, show=True)[source]
plot_travel_time_vs_distance(distance_unit='degrees', log_scale=False, savepath=None, show=True)[source]

Plot travel time versus distance for seismic picks.

Creates a scatter plot of travel time (y-axis) against distance (x-axis), with points colored by seismic phase type.

Parameters:

  • distance_unit (str, optional) –

    Unit for the x-axis distance. Options are:

    • “degrees” (default)

    • “km”

    If “km” is selected, the distance in degrees is converted to kilometers using an approximate Earth conversion (1 degree ≈ 111.19 km).

  • log_scale (bool, optional) – Whether to use a logarithmic scale on the x-axis (default is False).

  • show (bool, optional) – Whether to display the plot on screen (default is True).

  • savepath (str or None, optional) – If provided, the figure will be saved to this path with dpi=300 (default is None).
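The conversion factor of about 111.19 km per degree matches the length of one degree of arc on a sphere with a radius of 6371 km. The sketch below shows that arithmetic; the radius value is an assumption used to reproduce the number, and degrees_to_km is a hypothetical helper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius
KM_PER_DEGREE = 2 * math.pi * EARTH_RADIUS_KM / 360.0


def degrees_to_km(distance_deg):
    """Convert an angular distance in degrees to kilometres along the surface."""
    return distance_deg * KM_PER_DEGREE


print(round(KM_PER_DEGREE, 2))
print(round(degrees_to_km(2.5), 1))
```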

property stations

Return all stations as a Pandas DataFrame.

class utdquake.Network(name, das=False)[source]

Bases: object

Represents a single network in UTDQuake.

Provides access to network-specific events, stations, picks, EventBank, and plotting utilities.

Parameters:
  • name (str) – Network name (e.g. “tx”, “uw”, “GCI”).

  • das (bool, optional) – Whether to use the DAS dataset. If True, you may define the DAS cache root environment variable (UTDQUAKE_DAS_ROOT).

property bank: obsplus.EventBank

Return the ObsPlus EventBank for this network.

compute_stats(use_cache=True, distance_bins=None)[source]

Compute basic statistics for this network and cache results.

Parameters:
  • use_cache (bool) – If True, load cached stats if available.

  • distance_column (str) – Column in picks to use for distance calculation. Can be ‘distance’ (deg) or ‘hyp_linear_distance’ (km).

  • distance_bins (list, optional) – Bin edges for epicentral distance histogram. Default: [0,30,60,100,150,200,300,np.inf].

Returns:

stats – Dictionary with computed stats:

  • depth_values

  • magnitude_values, mag_min, mag_max

  • distance_bins, counts_P, counts_S

  • az_gap_hist

  • azimuth_hist

Return type:

dict
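To make the role of distance_bins concrete, the sketch below counts distances per bin using the default edges listed above. It assumes half-open bins [lower, upper); the convention actually used by the package is not stated here, and count_per_bin and the sample distances are made up for illustration.

```python
from bisect import bisect_right
import math

DEFAULT_BINS = [0, 30, 60, 100, 150, 200, 300, math.inf]


def count_per_bin(distances, bins=DEFAULT_BINS):
    """Count values in half-open bins [bins[i], bins[i+1])."""
    counts = [0] * (len(bins) - 1)
    for d in distances:
        i = bisect_right(bins, d) - 1
        if 0 <= i < len(counts):  # skip values outside the bin range
            counts[i] += 1
    return counts


p_distances = [12.0, 45.0, 59.9, 60.0, 180.0, 950.0]  # made-up values
print(count_per_bin(p_distances))
```

Running the same function separately on P and S pick distances would give arrays analogous to counts_P and counts_S.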

property description: str

Return a description dictionary of the network.

Returns:

Keys include ‘events’, ‘total_stations’, and metadata fields.

Return type:

dict

property events: pandas.DataFrame

Return events DataFrame for this network.

property picks: pandas.DataFrame

Return picks DataFrame for this network.

plot_overview(consider_calculated_stations=True, is_alaska=False, savepath=None, show=True, **kwargs)[source]

Plot network map with events, stations, histograms, and region.

Parameters:
  • consider_calculated_stations (bool, optional) – If True, also plot calculated stations (if available). Defaults to True.

  • is_alaska (bool, optional) – If True, use a projection suitable for Alaska. Default: False.

  • savepath (str or None) – Path to save the figure. If None, figure is not saved.

  • show (bool) – Whether to display the figure.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_overview().

Return type:

Same as utdquake.utils.plot.plot_overview().

plot_phase_count_radar_by_magnitude(savepath=None, show=True, **kwargs)[source]

Create radar plots of phase and station counts binned by magnitude.

For each magnitude range, the function displays the mean values of P phases, S phases, used phases, and station counts in a radar chart. Variability is represented using interquartile range (IQR) and the 10–90 percentile envelope.

Parameters:

  • savepath (str or None) – Path to save the figure. If None, the figure is not saved.

  • show (bool) – Whether to display the figure.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_phase_count_radar_by_magnitude().

Return type:

Same as utdquake.utils.plot.plot_phase_count_radar_by_magnitude().

plot_pick_histograms(savepath=None, show=True, **kwargs)[source]

Plot histograms of P picks, S picks, and Vp/Vs ratio.

Parameters:
  • savepath (str or None) – Path to save the figure. If None, figure is not saved.

  • show (bool) – Whether to display the figure.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_pick_histograms().

Return type:

Same as utdquake.utils.plot.plot_pick_histograms().

plot_pick_stats(distance_type='epicentral', savepath=None, show=True, **kwargs)[source]

Plot summary statistics for seismic picks (P, S, S-P).

The function computes:

  • First and last P travel times per event.

  • First and last S travel times per event.

  • First and last S-P times for stations with both P and S picks.

  • Corresponding distances (either epicentral or hypocentral).

It creates individual seaborn jointplots (scatter + marginal histograms), saves them temporarily as PNGs, and combines them into a single multi-panel matplotlib figure.

Parameters:
  • distance_type (str, default "epicentral") –

    Which distance to use:

    • “epicentral”: horizontal distance from the epicenter.

    • “hypocentral”: approximate distance from the hypocenter (linear approximation).

  • savepath (str or None) – Path to save the figure. If None, figure is not saved.

  • show (bool) – Whether to display the figure.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_pick_stats().

Return type:

Same as utdquake.utils.plot.plot_pick_stats().
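The per-event quantities listed in the description can be illustrated on a handful of made-up picks. This is a minimal sketch using plain tuples rather than the picks DataFrame; the tuple layout and station names are assumptions, and the package's own implementation may differ.

```python
from collections import defaultdict

# Made-up picks: (event_id, station, phase, travel_time_s)
picks = [
    ("ev1", "STA1", "P", 3.2), ("ev1", "STA1", "S", 5.6),
    ("ev1", "STA2", "P", 6.0), ("ev1", "STA2", "S", 10.4),
    ("ev1", "STA3", "P", 9.1),
]

by_phase = defaultdict(list)    # (event, phase) -> travel times
by_station = defaultdict(dict)  # (event, station) -> {phase: travel time}
for event, station, phase, tt in picks:
    by_phase[(event, phase)].append(tt)
    by_station[(event, station)][phase] = tt

first_p = min(by_phase[("ev1", "P")])
last_p = max(by_phase[("ev1", "P")])

# S-P is only defined where a station has both phases (STA3 is skipped)
s_minus_p = [round(p["S"] - p["P"], 2) for p in by_station.values()
             if "P" in p and "S" in p]
print(first_p, last_p, min(s_minus_p), max(s_minus_p))
```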

plot_station_location_uncertainty(savepath=None, show=True, **kwargs)[source]

Compare confirmed vs calculated station locations.

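Parameters:
  • savepath (str or None) – Path to save the figure. If None, the figure is not saved.

  • show (bool) – Whether to display the figure.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_station_location_uncertainty().

Return type: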

Same as utdquake.utils.plot.plot_station_location_uncertainty().

plot_stats(savepath=None, show=True, **kwargs)[source]

Create 5-panel seismic overview figure (depth, magnitude, distance, azimuth gap, azimuth distribution).

Parameters:
  • savepath (str or None) – Path to save the figure. If None, figure is not saved.

  • show (bool) – Whether to display the figure.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_stats().

Return type:

Same as utdquake.utils.plot.plot_stats().

plot_travel_time_qc(add_inset=True, zscore_threshold=2, show_text=True, show_models=None, show_global_model=False, distance_col='linear_hyp_distance', tt_col='travel_time', x_axins_limits=(0, 30), y_axins_limits=(0, 10), savepath=None, **kwargs)[source]

Plot multi-phase travel-time QC with optional inset zooms.

Parameters:
  • df (pd.DataFrame) – Travel-time data with multiple phases.

  • add_inset (bool, optional) – Add zoomed inset plots (default: True).

  • zscore_threshold (float, optional) – Z-score threshold for outlier detection (default: 2).

  • show_text (bool, optional) – Show text annotations on each subplot (default: True).

  • show_models (list of str, optional) – Columns in model to plot (default: [“travel_time_p50”]).

  • show_global_model (bool, optional) – Display global trend bounds (default: False).

  • distance_col (str, optional) – Column name for distance (default: “linear_hyp_distance”).

  • tt_col (str, optional) – Column name for travel time (default: “travel_time”).

  • x_axins_limits (tuple, optional) – X-axis limits for inset (default: (0, 30)).

  • y_axins_limits (tuple, optional) – Y-axis limits for inset (default: (0, 10)).

  • savepath (str, optional) – Path to save figure (default: None).

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_travel_time_qc().

Return type:

Same as utdquake.utils.plot.plot_travel_time_qc().
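The role of zscore_threshold can be illustrated with a plain z-score test. This is a minimal sketch, assuming the z-score is taken over travel-time residuals with a population standard deviation; which quantity the package standardizes, and whether it does so per phase or per distance range, is not stated here. flag_outliers is a hypothetical helper.

```python
from statistics import mean, pstdev


def flag_outliers(values, zscore_threshold=2):
    """Return (z-scores, outlier flags) for a list of travel-time residuals."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values), [False] * len(values)
    z = [(v - mu) / sigma for v in values]
    return z, [abs(zi) > zscore_threshold for zi in z]


# Made-up residuals (observed minus modelled travel time, seconds)
residuals = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 5.0]
z, is_outlier = flag_outliers(residuals)
print(is_outlier)
```

Lowering the threshold flags more picks; the default of 2 used by this method is stricter than the 3 used by Dataset.plot_travel_time.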

plot_travel_time_vs_distance(distance_unit='degrees', log_scale=False, savepath=None, show=True, **kwargs)[source]

Plot travel time versus distance for seismic picks.

Creates a scatter plot of travel time (y-axis) against distance (x-axis), with points colored by seismic phase type.

Parameters:

  • distance_unit (str, optional) –

    Unit for the x-axis distance. Options are:

    • “degrees” (default)

    • “km”

    If “km” is selected, the distance in degrees is converted to kilometers using an approximate Earth conversion (1 degree ≈ 111.19 km).

  • log_scale (bool, optional) – Whether to use a logarithmic scale on the x-axis (default is False).

  • show (bool, optional) – Whether to display the plot on screen (default is True).

  • savepath (str or None, optional) – If provided, the figure will be saved to this path with dpi=300 (default is None).

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_travel_time_vs_distance().

Return type:

Same as utdquake.utils.plot.plot_travel_time_vs_distance().

plot_travel_time_vs_distance_zscore(phase='P', distance_unit='hypo_km', savepath=None, x_lim=(0, 300), y_lim=(0, 50), **kwargs)[source]

Plot travel time versus distance colored by z-score values.

Parameters:
  • picks (pandas.DataFrame) –

    Input DataFrame containing seismic pick information.

    Required columns depend on the selected distance_unit:

    Common required columns:

    • travel_time

    • phase

    • travel_time_zscore

    Additional distance column:

    • distance for "degrees" or "km"

    • linear_hyp_distance for "hypo_km"

  • phase (str or None, default="P") – Seismic phase to plot. If None, all phases are included.

  • distance_unit (str, default="hypo_km") –

    Distance representation used for the x-axis.

    Supported options are:

    • "degrees"

    • "km"

    • "hypo_km"

  • log_scale (bool, default=False) – If True, apply logarithmic scaling to the x-axis.

  • show (bool, default=True) – If True, display the figure.

  • savepath (str or None, default=None) – Output path used to save the generated figure.

  • point_size (int, default=5) – Marker size used in scatter plots.

  • zmax (float, default=3.0) –

    Maximum z-score value used for color normalization.

    Picks with absolute z-score values larger than zmax are classified as outliers.

  • x_lim (tuple or None, default=(0, 300)) – X-axis limits in the form (xmin, xmax).

  • y_lim (tuple or None, default=(0, 50)) – Y-axis limits in the form (ymin, ymax).

  • add_inset (bool, default=True) – If True, add a zoomed inset axis.

  • x_axins_limits (tuple, default=(0, 30)) – X-axis limits for the inset axes.

  • y_axins_limits (tuple, default=(0, 10)) – Y-axis limits for the inset axes.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_travel_time_vs_distance_zscore().

Return type:

Same as utdquake.utils.plot.plot_travel_time_vs_distance_zscore().

Raises:
  • ValueError – If picks is not a pandas DataFrame.

  • ValueError – If required columns are missing.

  • ValueError – If no picks are available for the selected phase.

Notes

  • Absolute z-score values are used for thresholding.

  • Outliers are plotted in gray.

  • A colorbar indicates z-score magnitude.

  • Approximate conversion factor: 1 degree = 111.19 km.
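The column requirements listed under picks can be checked before plotting. The sketch below uses only the column names given above; the helpers required_columns and missing_columns are hypothetical, and raising ValueError for an unknown distance_unit is an assumption rather than documented behaviour.

```python
# Column names are taken from the parameter description above.
COMMON_COLUMNS = {"travel_time", "phase", "travel_time_zscore"}
DISTANCE_COLUMN = {
    "degrees": "distance",
    "km": "distance",
    "hypo_km": "linear_hyp_distance",
}


def required_columns(distance_unit):
    """Return the set of columns needed for a given distance_unit."""
    if distance_unit not in DISTANCE_COLUMN:
        raise ValueError(f"Unsupported distance_unit: {distance_unit!r}")
    return COMMON_COLUMNS | {DISTANCE_COLUMN[distance_unit]}


def missing_columns(columns, distance_unit):
    """List required columns absent from `columns`, sorted for stable output."""
    return sorted(required_columns(distance_unit) - set(columns))


available = ["travel_time", "phase", "distance"]
print(missing_columns(available, "hypo_km"))
```

With a pandas DataFrame, passing picks.columns as the first argument would give the same check.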

plot_uncertainty_boxplots(savepath=None, show=True, **kwargs)[source]

Plot horizontal/vertical uncertainty and standard error boxplots.

Parameters:
  • savepath (str or None) – Path to save the figure. If None, figure is not saved.

  • show (bool) – Whether to display the figure.

  • **kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_uncertainty_boxplots().

Return type:

Same as utdquake.utils.plot.plot_uncertainty_boxplots().

property stations: pandas.DataFrame

Return stations DataFrame for this network.

property travel_time: TravelTimeModel

Return the TravelTimeModel for this network.

utdquake.download_snapshot(local_dir, networks, das=False, include_banks=True, include_networks=True, include_events=True, include_stations=True, include_picks=True, include_travel_time=False, overwrite=True, unzip_banks=True)[source]

Download selected data from the UTDQuake Hugging Face repository.

Parameters:
  • local_dir (str or Path) – Local directory where the data will be downloaded. Created if it does not exist.

  • networks (str or list of str) –

    Networks to download:

    • “*” downloads all networks

    • “t*” downloads all networks starting with ‘t’

    • [“tx”, “uw”] downloads only the specified networks

  • das (bool, optional) – If True, downloads from the DAS dataset paths. Default: False (standard paths).

  • include_banks (bool, optional) – Whether to download the bank (synthetic) data. Default: True.

  • include_networks (bool, optional) – Whether to download the network metadata. Default: True.

  • include_events (bool, optional) – Whether to download the events data. Default: True.

  • include_stations (bool, optional) – Whether to download station metadata. Default: True.

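  • include_picks (bool, optional) – Whether to download seismic picks. Default: True.

  • include_travel_time (bool, optional) – Whether to download the travel time models. Default: False.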

  • overwrite (bool, optional) – If True, existing files will be re-downloaded. Default: True.

  • unzip_banks (bool, optional) – If True, downloaded bank zip files will be automatically extracted. Default: True.

Returns:

Path to the local directory containing the downloaded snapshot.

Return type:

Path

Notes

The function builds a set of allowed file patterns based on the requested networks and data types. Only files matching these patterns are downloaded. Zip files in banks are optionally unzipped.
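The pattern-building step described in the note can be pictured as follows. This is only a sketch: the "<data_type>/<network>/*" layout is HYPOTHETICAL, since the real repository layout is not described on this page, and build_allow_patterns is an invented helper. A list of this form is what the allow_patterns argument of huggingface_hub.snapshot_download accepts; whether download_snapshot calls it internally is an assumption.

```python
def build_allow_patterns(networks, include_events=True, include_stations=True,
                         include_picks=True):
    """Build glob patterns for the requested networks and data types."""
    names = [networks] if isinstance(networks, str) else list(networks)
    flags = {"events": include_events, "stations": include_stations,
             "picks": include_picks}
    # HYPOTHETICAL layout "<data_type>/<network>/*"; the real repository
    # layout is not described on this page.
    return [f"{kind}/{name}/*" for kind, on in flags.items() if on
            for name in names]


print(build_allow_patterns(["tx", "uw"], include_picks=False))
```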

Examples

>>> download_snapshot("/tmp/utdquake", networks=["tx", "uw"], include_picks=False)
>>> download_snapshot("/tmp/utdquake", networks="*", overwrite=False)
utdquake.load(key, network=None, streaming=False, das=False, **kwargs)[source]

Load a dataset from the UTDQuake Hugging Face repository.

Parameters:
  • key (str) – Dataset key. Must be one of “networks”, “stations”, “events”, “picks”.

  • network (str or list of str, optional) – Network code(s) to filter. Use “*” for all networks. Ignored if key == “networks”.

  • das (bool, optional) – If True, loads from the DAS cache root instead of the standard cache root.

  • streaming (bool, optional) – If True, loads dataset in streaming mode (lazy iteration).

  • **kwargs (dict) – Additional keyword arguments forwarded to datasets.load_dataset.

Returns:

Loaded Hugging Face dataset. Type depends on streaming.

Return type:

Dataset or IterableDataset

Raises:

ValueError – If key is not one of the supported dataset types.

Notes

  • When key is “networks”, the network parameter is ignored.

  • For other keys, the network argument filters the files to load.

  • Supports both single network (str) and multiple networks (list of str).
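The argument handling described in these notes can be sketched as a small validation step. The valid keys come from the parameter list above; normalize_request is a hypothetical helper, and treating network=None as “*” is an assumption.

```python
VALID_KEYS = ("networks", "stations", "events", "picks")


def normalize_request(key, network=None):
    """Validate `key` and return (key, list of network codes or None)."""
    if key not in VALID_KEYS:
        raise ValueError(f"key must be one of {VALID_KEYS}, got {key!r}")
    if key == "networks":
        return key, None  # network filter is ignored for this key
    if network is None:
        network = "*"  # assumption: no filter means all networks
    codes = [network] if isinstance(network, str) else list(network)
    return key, codes


print(normalize_request("stations", network="tx"))
print(normalize_request("events", network=["tx", "uw"]))
print(normalize_request("networks", network="tx"))
```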

Examples

>>> ds = load("stations", network="tx")
>>> ds = load("events", network=["tx","uw"], streaming=True)