Quick Overview
This is UTDQuake in a nutshell. This page covers the main classes and functions for working with the UTDQuake dataset.
For more specialized functionality, explore the Core and Utils subpackages.
UTDQuake Python Package
Provides convenient access to the UTDQuake seismic dataset. The package includes:
Dataset: Class for global dataset access (all networks, stations, events).
Network: Class for network-specific access, including EventBank, picks, and stations.
download_snapshot: Function to download UTDQuake data from Hugging Face.
load: Function to load datasets from Hugging Face by key.
Usage
>>> from utdquake import Dataset
>>> ds = Dataset()
>>> ds.stations.head()
>>> net = ds.get_network("tx")
>>> net.events.head()
- class utdquake.Dataset(das=False)[source]
Bases: object
High-level interface for the UTDQuake dataset.
Provides access to networks, stations, events, and picks. Allows plotting and summary analysis of the dataset.
- Parameters:
das (bool, optional) – If True, use the DAS cache root environment variable (UTDQUAKE_DAS_ROOT). Otherwise use the standard cache root (UTDQUAKE_ROOT). Default is False.
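The choice between the two cache roots can be pictured as a simple environment-variable lookup. The sketch below is illustrative only: `resolve_cache_root` and the fallback directory are hypothetical and not part of UTDQuake; only the variable names UTDQUAKE_ROOT and UTDQUAKE_DAS_ROOT come from the description above.

```python
import os
from pathlib import Path


def resolve_cache_root(das=False, environ=None):
    """Pick a cache directory from the environment (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Variable names are taken from the parameter description above.
    var = "UTDQUAKE_DAS_ROOT" if das else "UTDQUAKE_ROOT"
    # The fallback location is an assumption made for this sketch only.
    fallback = Path.home() / ".cache" / ("utdquake_das" if das else "utdquake")
    return Path(environ.get(var, fallback))


env = {"UTDQUAKE_ROOT": "/data/utdquake", "UTDQUAKE_DAS_ROOT": "/data/utdquake_das"}
print(resolve_cache_root(das=False, environ=env))
print(resolve_cache_root(das=True, environ=env))
```

Passing the environment as an argument keeps the sketch testable without modifying the real process environment.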
- compute_stats(networks=None, use_cache=True, distance_bins=None, merge=False)[source]
Compute statistics for multiple networks and cache results.
- Parameters:
networks (list of str, optional) – List of network names. If None or ‘*’, use all local networks.
use_cache (bool) – Whether to use cached stats if available.
distance_bins (list, optional) – Distance bins for epicentral distance histograms.
merge (bool) – If True, merge stats across networks for combined analysis. Default: False.
- Returns:
all_stats – If merge=False: {network_name: stats_dict}. If merge=True: a single merged stats_dict suitable for plotting.
- Return type:
dict
- property description: str
Return a summary of the dataset.
- Returns:
Summary of networks, stations, and events.
- Return type:
str
- download(networks, include_networks=True, include_events=False, include_stations=False, include_picks=False, include_banks=False, include_travel_time=False, unzip_banks=True, overwrite=False)[source]
Download selected data from the UTDQuake Hugging Face repository.
- Parameters:
networks (str or list of str) – Networks to download: “*” downloads all networks; “t*” downloads all networks starting with ‘t’; [“tx”, “uw”] downloads only the specified networks.
include_networks (bool, optional) – Whether to download the network metadata. Default: True.
include_events (bool, optional) – Whether to download the events data. Default: False.
include_stations (bool, optional) – Whether to download station metadata. Default: False.
include_picks (bool, optional) – Whether to download seismic picks. Default: False.
include_banks (bool, optional) – Whether to download the bank (synthetic) data. Default: False.
include_travel_time (bool, optional) – Whether to download the travel time models. Default: False.
overwrite (bool, optional) – If True, existing files will be re-downloaded. Default: False.
unzip_banks (bool, optional) – If True, downloaded bank zip files will be automatically extracted. Default: True.
- Returns:
Path to the local directory containing the downloaded snapshot.
- Return type:
Path
Notes
The function builds a set of allowed file patterns based on the requested networks and data types. Only files matching these patterns are downloaded. Zip files in banks are optionally unzipped.
Examples
>>> download(networks=["tx", "uw"], include_picks=True, overwrite=False)
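The network selection described in the notes behaves like shell-style wildcard matching. As a rough sketch (not UTDQuake's actual implementation), Python's standard fnmatch module can expand such a specification against a list of network codes; `select_networks` and the sample codes are hypothetical.

```python
from fnmatch import fnmatchcase


def select_networks(spec, available):
    """Return the codes in `available` matched by `spec` (hypothetical helper)."""
    # A single string is treated as a one-element list of patterns.
    patterns = [spec] if isinstance(spec, str) else list(spec)
    return [code for code in available
            if any(fnmatchcase(code, pattern) for pattern in patterns)]


available = ["tx", "uw", "tu", "ak"]  # sample codes for illustration
print(select_networks("*", available))
print(select_networks("t*", available))
print(select_networks(["tx", "uw"], available))
```

The sketch uses `fnmatchcase` rather than `fnmatch` so that matching stays case-sensitive on every platform.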
- property events
Return all events as a Pandas DataFrame.
- get_stations(network='*', streaming=False, **kwargs)[source]
Return stations for a specific network.
- property local_networks
Return all local networks as a Pandas DataFrame.
- property networks
Return all networks as a Pandas DataFrame.
- plot_network_station_density(density_in_region=True, savepath=None, show=True)[source]
Plot station density maps for all networks.
- Parameters:
density_in_region (bool, optional) – If True, calculate station density only within the network’s defined region. Default is True.
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
- plot_overview(consider_calculated_stations=True, savepath=None, show=True)[source]
Plot a comprehensive overview of UTDQuake dataset.
Includes events, stations, and summary analysis.
- Parameters:
consider_calculated_stations (bool, optional) – If True, also plot calculated stations (if available). Defaults to True.
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
- plot_phase_count_radar_by_magnitude(savepath=None, show=True)[source]
Create radar plots of phase and station counts binned by magnitude.
For each magnitude range, the function displays the mean values of P phases, S phases, used phases, and station counts in a radar chart. Variability is represented using interquartile range (IQR) and the 10–90 percentile envelope.
- Parameters:
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
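To make the variability bands concrete, the sketch below computes a mean, an interquartile range, and a 10–90 percentile envelope for one hypothetical bin of P-phase counts using only the standard library. The sample values are invented, and the quantile interpolation method UTDQuake uses is not stated here, so exact numbers may differ.

```python
from statistics import mean, quantiles

# Invented P-phase counts for events in one magnitude bin.
p_counts = [4, 6, 7, 8, 8, 9, 11, 12, 15, 21]

quartiles = quantiles(p_counts, n=4, method="inclusive")  # Q1, median, Q3
deciles = quantiles(p_counts, n=10, method="inclusive")   # 10th ... 90th percentile

summary = {
    "mean": mean(p_counts),
    "iqr": (quartiles[0], quartiles[2]),
    "p10_p90": (deciles[0], deciles[-1]),
}
print(summary)
```

The 10–90 envelope always contains the IQR, which is why the two bands nest when drawn on the same radar axis.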
- plot_stats(savepath=None, show=True)[source]
Plot a 5-panel figure using precomputed network stats. See Dataset.compute_stats(merge=True) for the expected stats structure.
Panels: depth, magnitude, epicentral distance, azimuthal gap, azimuth distribution.
- Parameters:
savepath (str, optional) – Path to save the figure. Default: None.
show (bool, optional) – If True, display the figure. Default: True.
- plot_travel_time_vs_distance(distance_unit='degrees', log_scale=False, savepath=None, show=True)[source]
Plot travel time versus distance for seismic picks.
Creates a scatter plot of travel time (y-axis) against distance (x-axis), with points colored by seismic phase type.
- Parameters:
distance_unit (str, optional) – Unit for the x-axis distance. Options are “degrees” (default) and “km”. If “km” is selected, the distance in degrees is converted to kilometers using an approximate Earth conversion (1 degree ≈ 111.19 km).
log_scale (bool, optional) – Whether to use a logarithmic scale on the x-axis. Default: False.
show (bool, optional) – Whether to display the plot on screen. Default: True.
savepath (str or None, optional) – If provided, the figure is saved to this path with dpi=300. Default: None.
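The conversion factor quoted above matches the length of one degree of arc on a sphere with a mean Earth radius of 6371 km. The sketch below reproduces it; the radius value and the helper name `degrees_to_km` are assumptions for illustration, not taken from the UTDQuake source.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed value for this sketch
KM_PER_DEGREE = 2 * math.pi * EARTH_RADIUS_KM / 360.0


def degrees_to_km(distance_deg):
    """Convert an angular distance in degrees to kilometers along the surface."""
    return distance_deg * KM_PER_DEGREE


print(round(KM_PER_DEGREE, 2))       # 111.19
print(round(degrees_to_km(2.5), 1))  # 278.0
```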
- property stations
Return all stations as a Pandas DataFrame.
- class utdquake.Network(name, das=False)[source]
Bases: object
Represents a single network in UTDQuake.
Provides access to network-specific events, stations, picks, EventBank, and plotting utilities.
- Parameters:
name (str) – Network name (e.g. “tx”, “uw”, “GCI”).
das (bool, optional) – Whether to use the DAS dataset. If True, you may define the DAS cache root environment variable (UTDQUAKE_DAS_ROOT). Default: False.
- property bank: obsplus.EventBank
Return the ObsPlus EventBank for this network.
- compute_stats(use_cache=True, distance_bins=None)[source]
Compute basic statistics for this network and cache results.
- Parameters:
use_cache (bool) – If True, load cached stats if available.
distance_column (str) – Column in picks to use for distance calculation. Can be ‘distance’ (deg) or ‘hyp_linear_distance’ (km).
distance_bins (list, optional) – Bin edges for epicentral distance histogram. Default: [0,30,60,100,150,200,300,np.inf].
- Returns:
stats – Dictionary with computed stats: depth_values; magnitude_values, mag_min, mag_max; distance_bins, counts_P, counts_S; az_gap_hist; azimuth_hist.
- Return type:
dict
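As an illustration of how the default bin edges partition epicentral distances, the sketch below counts values per bin with the standard library. It assumes half-open bins [lower, upper) and invented distances; the binning convention UTDQuake actually applies is not stated here, and `bin_counts` is a hypothetical helper.

```python
from bisect import bisect_right

# Default edges quoted in the parameter description above.
edges = [0, 30, 60, 100, 150, 200, 300, float("inf")]


def bin_counts(distances, edges):
    """Count values per half-open bin [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for d in distances:
        if d < edges[0]:
            continue  # below the first edge: not counted
        # bisect_right gives the index of the first edge greater than d;
        # the clamp keeps a value equal to the last edge in the last bin.
        index = min(bisect_right(edges, d) - 1, len(counts) - 1)
        counts[index] += 1
    return counts


distances = [5.0, 12.3, 30.0, 45.1, 99.9, 180.0, 250.0, 750.0]  # invented
print(bin_counts(distances, edges))
```

Because the last edge is infinite, every non-negative distance falls into some bin, so no pick is silently dropped at the far end.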
- property description: str
Return a description dictionary of the network.
- Returns:
Keys include ‘events’, ‘total_stations’, and metadata fields.
- Return type:
dict
- property events: pandas.DataFrame
Return events DataFrame for this network.
- property picks: pandas.DataFrame
Return picks DataFrame for this network.
- plot_overview(consider_calculated_stations=True, is_alaska=False, savepath=None, show=True, **kwargs)[source]
Plot network map with events, stations, histograms, and region.
- Parameters:
consider_calculated_stations (bool, optional) – If True, also plot calculated stations (if available). Defaults to True.
is_alaska (bool, optional) – If True, use a projection suitable for Alaska. Default: False.
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_overview().
- Return type:
Same as utdquake.utils.plot.plot_overview().
- plot_phase_count_radar_by_magnitude(savepath=None, show=True, **kwargs)[source]
Create radar plots of phase and station counts binned by magnitude.
For each magnitude range, the function displays the mean values of P phases, S phases, used phases, and station counts in a radar chart. Variability is represented using interquartile range (IQR) and the 10–90 percentile envelope.
- Parameters:
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_phase_count_radar_by_magnitude().
- Returns:
Same as utdquake.utils.plot.plot_phase_count_radar_by_magnitude().
- plot_pick_histograms(savepath=None, show=True, **kwargs)[source]
Plot histograms of P picks, S picks, and Vp/Vs ratio.
- Parameters:
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_pick_histograms().
- plot_pick_stats(distance_type='epicentral', savepath=None, show=True, **kwargs)[source]
Plot summary statistics for seismic picks (P, S, S-P).
The function computes the first and last P travel times per event, the first and last S travel times per event, the first and last S-P times for stations with both P and S picks, and the corresponding distances (either epicentral or hypocentral).
It creates individual seaborn jointplots (scatter + marginal histograms), saves them temporarily as PNGs, and combines them into a single multi-panel matplotlib figure.
- Parameters:
distance_type (str, default "epicentral") – Which distance to use: “epicentral” (horizontal distance from the epicenter) or “hypocentral” (approximate distance from the hypocenter, using a linear approximation).
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_pick_stats().
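The per-event quantities computed by this method can be sketched with plain Python. The pick records and field names below (event_id, station, phase, travel_time) are hypothetical, and "first and last" is interpreted here as the smallest and largest travel time; this only illustrates the grouping logic, not UTDQuake's actual column names or implementation.

```python
from collections import defaultdict

# Invented picks: (event_id, station, phase, travel_time in seconds).
picks = [
    ("ev1", "STA1", "P", 2.1), ("ev1", "STA1", "S", 3.8),
    ("ev1", "STA2", "P", 5.0), ("ev1", "STA2", "S", 8.9),
    ("ev1", "STA3", "P", 7.4),  # no S pick, so no S-P time here
]

by_event = defaultdict(lambda: {"P": {}, "S": {}})
for event_id, station, phase, travel_time in picks:
    by_event[event_id][phase][station] = travel_time

summary = {}
for event_id, phases in by_event.items():
    p, s = phases["P"], phases["S"]
    # S-P is only defined for stations that have both picks.
    s_minus_p = [s[sta] - p[sta] for sta in p.keys() & s.keys()]
    summary[event_id] = {
        "P": (min(p.values()), max(p.values())),
        "S": (min(s.values()), max(s.values())),
        "S-P": (min(s_minus_p), max(s_minus_p)),
    }

print(summary)
```

A real implementation would also need to handle events that lack P or S picks entirely, which this sketch does not.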
- plot_station_location_uncertainty(savepath=None, show=True, **kwargs)[source]
Compare confirmed vs calculated station locations.
- Parameters:
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_station_location_uncertainty().
- Return type:
Same as utdquake.utils.plot.plot_station_location_uncertainty().
- plot_stats(savepath=None, show=True, **kwargs)[source]
Create 5-panel seismic overview figure (depth, magnitude, distance, azimuth gap, azimuth distribution).
- Parameters:
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_stats().
- Return type:
Same as utdquake.utils.plot.plot_stats().
- plot_travel_time_qc(add_inset=True, zscore_threshold=2, show_text=True, show_models=None, show_global_model=False, distance_col='linear_hyp_distance', tt_col='travel_time', x_axins_limits=(0, 30), y_axins_limits=(0, 10), savepath=None, **kwargs)[source]
Plot multi-phase travel-time QC with optional inset zooms.
- Parameters:
df (pd.DataFrame) – Travel-time data with multiple phases.
add_inset (bool, optional) – Add zoomed inset plots (default: True).
zscore_threshold (float, optional) – Z-score threshold for outlier detection (default: 2).
show_text (bool, optional) – Show text annotations on each subplot (default: True).
show_models (list of str, optional) – Columns in model to plot (default: [“travel_time_p50”]).
show_global_model (bool, optional) – Display global trend bounds (default: False).
distance_col (str, optional) – Column name for distance (default: “linear_hyp_distance”).
tt_col (str, optional) – Column name for travel time (default: “travel_time”).
x_axins_limits (tuple, optional) – X-axis limits for inset (default: (0, 30)).
y_axins_limits (tuple, optional) – Y-axis limits for inset (default: (0, 10)).
savepath (str, optional) – Path to save figure (default: None).
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_travel_time_qc().
- plot_travel_time_vs_distance(distance_unit='degrees', log_scale=False, savepath=None, show=True, **kwargs)[source]
Plot travel time versus distance for seismic picks.
Creates a scatter plot of travel time (y-axis) against distance (x-axis), with points colored by seismic phase type.
- Parameters:
distance_unit (str, optional) – Unit for the x-axis distance. Options are “degrees” (default) and “km”. If “km” is selected, the distance in degrees is converted to kilometers using an approximate Earth conversion (1 degree ≈ 111.19 km).
log_scale (bool, optional) – Whether to use a logarithmic scale on the x-axis. Default: False.
show (bool, optional) – Whether to display the plot on screen. Default: True.
savepath (str or None, optional) – If provided, the figure is saved to this path with dpi=300. Default: None.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_travel_time_vs_distance().
- plot_travel_time_vs_distance_zscore(phase='P', distance_unit='hypo_km', savepath=None, x_lim=(0, 300), y_lim=(0, 50), **kwargs)[source]
Plot travel time versus distance colored by z-score values.
- Parameters:
picks (pandas.DataFrame) – Input DataFrame containing seismic pick information. Required columns depend on the selected distance_unit. Common required columns: travel_time, phase, travel_time_zscore. Additional distance column: distance for "degrees" or "km"; linear_hyp_distance for "hypo_km".
phase (str or None, default=None) – Seismic phase to plot. If None, all phases are included.
distance_unit (str, default="degrees") – Distance representation used for the x-axis. Supported options are "degrees", "km", and "hypo_km".
log_scale (bool, default=False) – If True, apply logarithmic scaling to the x-axis.
show (bool, default=True) – If True, display the figure.
savepath (str or None, default=None) – Output path used to save the generated figure.
point_size (int, default=5) – Marker size used in scatter plots.
zmax (float, default=3.0) – Maximum z-score value used for color normalization. Picks with absolute z-score values larger than zmax are classified as outliers.
x_lim (tuple or None, default=None) – X-axis limits in the form (xmin, xmax).
y_lim (tuple or None, default=None) – Y-axis limits in the form (ymin, ymax).
add_inset (bool, default=True) – If True, add a zoomed inset axis.
x_axins_limits (tuple, default=(0, 30)) – X-axis limits for the inset axes.
y_axins_limits (tuple, default=(0, 10)) – Y-axis limits for the inset axes.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_travel_time_vs_distance_zscore().
- Return type:
Same as utdquake.utils.plot.plot_travel_time_vs_distance_zscore().
- Raises:
ValueError – If picks is not a pandas DataFrame.
ValueError – If required columns are missing.
ValueError – If no picks are available for the selected phase.
Notes
Absolute z-score values are used for thresholding.
Outliers are plotted in gray.
A colorbar indicates z-score magnitude.
Approximate conversion factor: 1 degree = 111.19 km.
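The thresholding rule in these notes can be expressed in a few lines. In this sketch the z-scores are invented and assumed to be precomputed (as in the travel_time_zscore column); `split_outliers` is a hypothetical helper, and zmax=3.0 mirrors the documented default.

```python
def split_outliers(zscores, zmax=3.0):
    """Separate values by absolute z-score: |z| > zmax counts as an outlier."""
    inliers = [z for z in zscores if abs(z) <= zmax]
    outliers = [z for z in zscores if abs(z) > zmax]
    return inliers, outliers


zscores = [-0.4, 1.2, -3.5, 2.9, 4.1, 0.0]  # invented values
inliers, outliers = split_outliers(zscores)
print(inliers)   # [-0.4, 1.2, 2.9, 0.0]
print(outliers)  # [-3.5, 4.1]
```

Using the absolute value means that picks arriving much earlier than expected are flagged just like picks arriving much later.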
- plot_uncertainty_boxplots(savepath=None, show=True, **kwargs)[source]
Plot horizontal/vertical uncertainty and standard error boxplots.
- Parameters:
savepath (str or None) – Path to save the figure. If None, figure is not saved.
show (bool) – Whether to display the figure.
**kwargs – Additional keyword arguments passed directly to utdquake.utils.plot.plot_uncertainty_boxplots().
- property stations: pandas.DataFrame
Return stations DataFrame for this network.
- property travel_time: TravelTimeModel
Return the TravelTimeModel for this network.
- utdquake.download_snapshot(local_dir, networks, das=False, include_banks=True, include_networks=True, include_events=True, include_stations=True, include_picks=True, include_travel_time=False, overwrite=True, unzip_banks=True)[source]
Download selected data from the UTDQuake Hugging Face repository.
- Parameters:
local_dir (str or Path) – Local directory where the data will be downloaded. Created if it does not exist.
networks (str or list of str) – Networks to download: “*” downloads all networks; “t*” downloads all networks starting with ‘t’; [“tx”, “uw”] downloads only the specified networks.
das (bool, optional) – If True, downloads from the DAS dataset paths. Default: False (standard paths).
include_banks (bool, optional) – Whether to download the bank (synthetic) data. Default: True.
include_networks (bool, optional) – Whether to download the network metadata. Default: True.
include_events (bool, optional) – Whether to download the events data. Default: True.
include_stations (bool, optional) – Whether to download station metadata. Default: True.
include_picks (bool, optional) – Whether to download seismic picks. Default: True.
overwrite (bool, optional) – If True, existing files will be re-downloaded. Default: True.
unzip_banks (bool, optional) – If True, downloaded bank zip files will be automatically extracted. Default: True.
- Returns:
Path to the local directory containing the downloaded snapshot.
- Return type:
Path
Notes
The function builds a set of allowed file patterns based on the requested networks and data types. Only files matching these patterns are downloaded. Zip files in banks are optionally unzipped.
Examples
>>> download_snapshot("/tmp/utdquake", networks=["tx", "uw"], include_picks=False)
>>> download_snapshot("/tmp/utdquake", networks="*", overwrite=False)
- utdquake.load(key, network=None, streaming=False, das=False, **kwargs)[source]
Load a dataset from the UTDQuake Hugging Face repository.
- Parameters:
key (str) – Dataset key. Must be one of “networks”, “stations”, “events”, “picks”.
network (str or list of str, optional) – Network code(s) to filter. Use “*” for all networks. Ignored if key == “networks”.
das (bool, optional) – If True, loads from the DAS cache root instead of the standard cache root. Default: False.
streaming (bool, optional) – If True, loads dataset in streaming mode (lazy iteration).
**kwargs (dict) – Additional keyword arguments forwarded to datasets.load_dataset.
- Returns:
Loaded Hugging Face dataset. Type depends on streaming.
- Return type:
Dataset or IterableDataset
- Raises:
ValueError – If key is not one of the supported dataset types.
Notes
When key is “networks”, the network parameter is ignored.
For other keys, the network argument filters the files to load.
Supports both single network (str) and multiple networks (list of str).
Examples
>>> ds = load("stations", network="tx")
>>> ds = load("events", network=["tx", "uw"], streaming=True)