Title: | Flood Frequency Analysis Framework |
Description: | Tools to support systematic and reproducible workflows for both stationary and nonstationary flood frequency analysis, with applications extending to other hydroclimate extremes, such as precipitation frequency analysis. This package implements the FFA framework proposed by Vidrio- Sahagún et al. (2024) <doi:10.1016/j.envsoft.2024.105940>, originally developed in MATLAB, now adapted for the R environment. This work was funded by the Flood Hazard Identification and Mapping Program of Environment and Climate Change Canada, as well as the Canada Research Chair (Tier 1) awarded to Dr. Pietroniro. |
Version: | 0.1.0 |
URL: | https://rileywheadon.github.io/ffa-framework/ |
BugReports: | https://github.com/rileywheadon/ffa-framework/issues |
License: | AGPL (≥ 3) |
Encoding: | UTF-8 |
LazyData: | true |
Depends: | R (≥ 4.4.0) |
Imports: | ggplot2, patchwork, httr, stats, glue, jsonlite, base64enc |
Suggests: | knitr, rmarkdown, spelling, testthat, vdiffr |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
Config/Needs/website: | rmarkdown |
Language: | en-US |
NeedsCompilation: | no |
Packaged: | 2025-08-20 01:37:46 UTC; rileywheadon |
Author: | Riley Wheadon [aut, cre], Cuauhtémoc Vidrio-Sahagún [aut], Alain Pietroniro [aut, fnd], Jianxun He [aut], Environment and Climate Change Canada (ECCC) [fnd] |
Maintainer: | Riley Wheadon <rileywheadon@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-08-25 08:20:02 UTC |
Flood Frequency Analysis Framework
Description
This package provides tools for stationary (S-FFA) and nonstationary (NS-FFA)
flood flood frequency analysis of annual maximum series data. High-level wrapper
functions with the framework_*
prefix orchestrate the EDA and/or FFA modules from
Vidrio-Sahagún et al. (2024) and generate
reports. Users who wish to develop customized workflows may use methods with
the following prefixes:
-
eda_*
: Explore annual maximum series data for evidence of nonstationarity to inform approach selection (S-FFA or NS-FFA):Detect statistically significant change points.
Detect statistically significant temporal trends in the mean and variability.
-
select_*
: Select a suitable probability distribution using the L-moments. -
fit_*
: Fit parameters given a distribution and approach (S-FFA or NS-FFA). -
uncertainty_*
: Quantify uncertainty by computing confidence intervals. -
model_assessment()
evaluates model performance for using a variety of metrics.
Additional utility functions for visualization and computation are also available:
-
data_*
methods load, transform, and decompose annual maximum series data. -
plot_*
methods produce diagnostic and summary plots. -
utils_*
methods implement distribution-specific computations.
Datasets from five hydrometric stations in Canada are provided as representative
use cases (other datasets in /inst/extdata
are for testing purposes only):
Athabasca River at Athabasca (CAN-07BE001): An unregulated station with no statistical evidence of trends or change points (S-FFA recommended).
Kootenai River at Porthill (CAN-08NH21): A regulated station with evidence of an abrupt change in mean in 1972 (piecewise NS-FFA recommended).
Bow River at Banff (CAN-05BB001). An unregulated station with statistical evidence of a trend in the mean (NS-FFA recommended).
Chilliwack River at Chilliwack Lake (CAN-08MH016): An unregulated station with statistical evidence of a linear trend in variability (NS-FFA recommended).
Okanagan River at Penticton (CAN-08NM050): A regulated station with statistical evidence of a linear trend in both the mean and variability (NS-FFA recommended).
This package assumes familiarity with statistical techniques used in FFA, including parameter estimation (e.g., L-moments and maximum likelihood), dataset decomposition, and uncertainty quantification (parametric bootstrap and profile likelihood). For an explanation of these methods, see the FFA Framework wiki. For examples, see the vignettes on exploratory data analysis and flood frequency analysis.
Author(s)
Maintainer: Riley Wheadon rileywheadon@gmail.com
Authors:
Cuauhtémoc Vidrio-Sahagún ct.vidrio-sahagun@usask.ca
Alain Pietroniro alain.pietroniro@ucalgary.ca [funder]
Jianxun He jianhe@ucalgary.ca
Other contributors:
Environment and Climate Change Canada (ECCC) [funder]
See Also
Useful links:
Report bugs at https://github.com/rileywheadon/ffa-framework/issues
CAN-05BB001
Description
A dataframe of annual maximum series observations for station 05BB001, BOW RIVER AT BANFF in Alberta, Canada.
Usage
CAN_05BB001
Format
A dataframe with 110 rows and 2 columns spanning the period 1909-2018.
Details
Variables:
-
max
: Numeric; the observed annual maximum series, in m^3
/s. -
year
: Integer; the corresponding years.
Additional Information
This is an unregulated station in the RHBN. Whitfield & Pomeroy (2016) found that floods may be caused by rain, snow, or a combination of both. Therefore, practitioners should be careful when interpreting the results of FFA. Minimal human intervention in the basin means there is little justification for change points. EDA finds evidence of a decreasing trend in the mean.
Source
Meteorological Service of Canada (MSC) GeoMet Platform
References
Whitfield P. H., and Pomeroy J. W. (2016) Changes to flood peaks of a mountain river: implications for analysis of the 2013 flood in the Upper Bow River, Canada, Hydrological Processes, 30: 4657–4673. doi:10.1002/hyp.10957.
CAN-07BE001
Description
A dataframe of annual maximum series observations for station 07BE001, ATHABASCA RIVER AT ATHABASCA in Alberta, Canada.
Usage
CAN_07BE001
Format
A dataframe with 108 rows and 2 columns spanning the period 1913-2020.
Details
Variables:
-
max
: Numeric; the observed annual maximum series, in m^3
/s. -
year
: Integer; the corresponding years.
Additional Information
This is an unregulated station that is not in the RHBN. Additionally,
The MKS/Pettitt tests find no evidence of change points at the 0.05 significance level.
Trend detection finds no evidence of trends in the mean or variability.
This dataset is an excellent introductory example to stationary FFA.
Source
Meteorological Service of Canada (MSC) GeoMet Platform
CAN-08MH016
Description
A dataframe of annual maximum series observations for station 08MH016, CHILLIWACK RIVER AT CHILLIWACK LAKE in British Columbia, Canada.
Usage
CAN_08MH016
Format
A dataframe with 95 rows and 2 columns spanning the period 1922-2016.
Details
Variables:
-
max
: Numeric; the observed annual maximum series, in m^3
/s. -
year
: Integer; the corresponding years.
Additional Information
This is an unregulated station in the RHBN. Additionally,
The MKS/Pettitt tests find no evidence of change points at the 0.05 significance level.
Trend detection finds evidence of an increasing trend in the variability.
Source
Meteorological Service of Canada (MSC) GeoMet Platform
CAN-08NH021
Description
A dataframe of annual maximum series observations for station 08NH021, KOOTENAI RIVER AT PORTHILL in British Columbia, Canada.
Usage
CAN_08NH021
Format
A dataframe with 91 rows and 2 columns spanning the period 1928-2018.
Details
Variables:
-
max
: Numeric; the observed annual maximum series, in m^3
/s. -
year
: Integer; the corresponding years.
Additional Information
This is a regulated station that is not in the RHBN. Additionally,
The Libby dam was constructed upstream of this station in 1972.
The Pettitt test finds evidence of a change point in 1972 at the 0.05 significance level.
The MKS test finds evidence of change points in 1960 & 1985 at the 0.05 significance level.
This dataset is an excellent introduction to change point detection and piecewise NS-FFA.
Source
Meteorological Service of Canada (MSC) GeoMet Platform
CAN-08NM050
Description
A dataframe of annual maximum series observations for station 08NM050, OKANAGAN RIVER AT PENTICTON in British Columbia, Canada.
Usage
CAN_08NM050
Format
A dataframe with 97 rows and 2 columns spanning the period 1921-2017.
Details
Variables:
-
max
: Numeric; the observed annual maximum series, in m^3
/s. -
year
: Integer; the corresponding years.
Additional Information
This is a regulated station that is not part of the RHBN. The Okanagan River upstream of the station has been regulated since 1914 due to the construction of the first dam, followed by a second dam in 1920, and a regulation system in the early 1950s, consisting of four dams and 38 km of engineered channel. Rapid human settlement, development, and agricultural activity have also occurred in the watershed.
This dataset exhibits serial correlation and trends in both the mean and variability. Since the station is heavily influenced by reservoir operations, this dataset is intended for teaching purposes—not for practical flood estimation.
Source
Meteorological Service of Canada (MSC) GeoMet Platform
Decompose Annual Maximum Series
Description
Decomposes a nonstationary annual maxima series to derive its stationary stochastic component, which can be used to identify a best-fit distribution using conventional stationary methods, like those based on L-moments. The decomposition procedure follows that proposed by Vidrio-Sahagún and He (2022), which relies on the statistical representation of nonstationary stochastic processes.
Usage
data_decomposition(data, ns_years, ns_structure)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Details
Internally, the function does the following:
If there is a trend in the location, fit Sen's trend estimator and subtract away the fitted trend.
If there is a trend in the scale, estimate the variability of the data with
data_mw_variability()
, fit Sen's trend estimator to the variability vector, and rescale the data to remove the trend.If necessary, shift the data so that its minimum is at least 1.
Value
Numeric vector of decomposed data.
References
Vidrio-Sahagún, C. T., and He, J. (2022). The decomposition-based nonstationary flood frequency analysis. Journal of Hydrology, 612 (September 2022), 128186. doi:10.1016/j.jhydrol.2022.128186
See Also
data_mw_variability()
, eda_sens_trend()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
ns_years <- seq(from = 1901, to = 2000)
ns_structure <- list(location = TRUE, scale = FALSE)
data_decomposition(data, ns_years, ns_structure)
Fetch Data from MSC GeoMet API
Description
Gets annual maximum series data for a hydrological monitoring station from the MSC GeoMet API.
Usage
data_geomet(station_id)
Arguments
station_id |
A character scalar containing the ID of a hydrological monitoring station. You can search for station IDs by name, province, drainage basin, and location here. |
Value
A dataframe with two columns:
-
max
: A float, the annual maximum series observation, in m^3
/s. -
year
: An integer, the corresponding year.
See Also
Examples
# Get data for the BOW RIVER AT BANFF (05BB001)
df <- data_geomet("05BB001")
Fetch Local Package Data
Description
Fetch annual maximum series data for a hydrological monitoring station from the package data directory.
Usage
data_local(csv_file)
Arguments
csv_file |
A character scalar containing the file name of a local dataset in
|
Value
A dataframe with two columns:
-
max
: A float, the annual maximum series observation, in m^3
/s. -
year
: An integer, the corresponding year.
See Also
Examples
# Get data for the BOW RIVER AT BANFF (05BB001)
df <- data_local("CAN-05BB001.csv")
Estimate Variance for Annual Maximum Series Data
Description
Generates a time series of standard deviations using a moving window algorithm,
which can be used to explore potential evidence of nonstationarity in the
variability of a dataset. It returns a list that pairs each window’s mean year with
its window standard deviation. The hyperparameters size
and step
control the
behaviour of the moving window. Following the simulation findings from Vidrio-Sahagún
and He (2022), the default window size and step are set to 10 and 5 years
respectively. However, these can be changed by the user.
Usage
data_mw_variability(data, years, size = 10L, step = 5L)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
size |
Integer scalar. The number of years in each moving window.
Must be a positive number less than or equal to |
step |
Integer scalar. The offset (in years) between successive moving windows. Must be a positive number (default is 5). |
Value
A list with two entries:
-
years
: Numeric vector containing the mean year within each window. -
std
: Numeric vector of standard deviations within each window.
References
Vidrio-Sahagún, C. T., and He, J. (2022). The decomposition-based nonstationary flood frequency analysis. Journal of Hydrology, 612 (September 2022), 128186. doi:10.1016/j.jhydrol.2022.128186
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
data_mw_variability(data, years)
Perform Data Screening
Description
Checks for missing entries and generates a list of summary statistics about a dataset.
Usage
data_screening(data, years)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
Value
A list with seven entries:
-
years_min
: The minimum value in the 'years' argument. -
years_max
: The maximum value in the 'years' argument. -
data_min
: The minimum value in the 'data' argument. -
data_med
: The median value in the 'data' argument. -
data_max
: The maximum value in the 'data' argument. -
missing_years
: An integer vector of years with no data. -
missing_count
: The number of missing entries in the dataset.
Examples
data <- rnorm(n = 10, mean = 100, sd = 10)
years <- c(1900, 1902, 1903, 1904, 1905, 1907, 1909, 1911, 1912, 1914)
data_screening(data, years)
Block-Bootstrap Mann-Kendall Test for Trend Detection
Description
Performs a bootstrapped version of the Mann-Kendall trend test to adjust for serial correlation in annual maximum series data. The BBMK test uses Spearman’s serial correlation test to identify the least insignificant lag, then applies a shuffling procedure to obtain the empirical p-value and confidence bounds for the Mann-Kendall test statistic.
Usage
eda_bbmk_test(data, alpha = 0.05, samples = 10000L)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
alpha |
Numeric scalar in |
samples |
Integer scalar. The number of bootstrap samples. Default is 10000. |
Details
The block size for reshuffling is equal to the least_lag
as estimated in
eda_spearman_test()
. Bootstrap samples are generated by resampling blocks
of the original data without replacement. This procedure effectively removes
serial correlation from the data.
Value
A list containing the test results, including:
-
data
: Thedata
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
statistic
: The Mann-Kendall S-statistic computed on the original series. -
bootstrap
: Vector of bootstrapped Mann-Kendall test statistics. -
p_value
: Empirical two-sided p-value derived from the bootstrap distribution. -
bounds
: Empirical confidence interval bounds from the bootstrap distribution. -
reject
: IfTRUE
, the null hypothesis was rejected at significancealpha
.
References
Bayazit, M., 2015. Nonstationarity of hydrological records and recent trends in trend analysis: a state-of-the-art review. Environmental Processes 2 (3), 527–542. doi:10.1007/s40710-015-0081-7
Khaliq, M.N., Ouarda, T.B.M.J., Gachon, P., Sushama, L., St-Hilaire, A., 2009. Identification of hydrological trends in the presence of serial and cross correlations: a review of selected methods and their application to annual flow regimes of Canadian rivers. Journal Hydrolology 368 (1–4), 117–130. doi:10.1016/j.jhydrol.2009.01.035
Sonali, P., Nagesh Kumar, D., 2013. Review of trend detection methods and their application to detect temperature changes in India. Journal Hydrology 476, 212–227. doi:10.1016/j.jhydrol.2012.10.034
See Also
plot_bbmk_test()
, eda_mk_test()
, eda_spearman_test()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
eda_bbmk_test(data, samples = 1000L)
Kwiatkowski–Phillips–Schmidt–Shin (KPSS) Unit Root Test
Description
Performs the KPSS unit root test on annual maximum series data. The null hypothesis is that the time series is trend-stationary with a linear deterministic trend and constant drift. The alternative hypothesis is that the time series has a unit root (also known as a stochastic trend).
Usage
eda_kpss_test(data, alpha = 0.05)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
alpha |
Numeric scalar in |
Details
The implementation of the KPSS test is based on the aTSA package, which
interpolates a significance table from Hobijn et al. (2004). Therefore, a result
of p = 0.01
implies that p \leq 0.01
and a result of p = 0.10
implies that p \geq 0.10
.
Value
A list containing the test results, including:
-
data
: Thedata
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
statistic
: The KPSS test statistic. -
p_value
: The interpolated p-value. See the details for more information. -
reject
: IfTRUE
, the null hypothesis was rejected at significancealpha
.
References
Hobijn, B., Franses, P.H. and Ooms, M. (2004), Generalizations of the KPSS-test for stationarity. Statistica Neerlandica, 58: 483-502.
Kwiatkowski, D.; Phillips, P. C. B.; Schmidt, P.; Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root. Journal of Econometrics, 54 (1-3): 159-178.
See Also
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
eda_kpss_test(data)
Mann–Kendall Trend Test
Description
Performs the Mann–Kendall trend test on a numeric vector to detect the presence of an increasing or decreasing monotonic trend over time. The test is nonparametric and accounts for tied observations in the data. The null hypothesis assumes there is no monotonic trend.
Usage
eda_mk_test(data, alpha = 0.05)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
alpha |
Numeric scalar in |
Details
The test statistic S
is the sum over all pairs i < j
of the sign
of the difference x_j - x_i
. Ties are explicitly accounted for when
calculating the variance of S
, using grouped frequencies of tied observations.
The test statistic Z
is then computed based on the sign and magnitude of
S
, and the p-value is derived from the standard normal distribution.
Value
A list containing the test results, including:
-
data
: Thedata
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
statistic
: The Mann–Kendall test statistic. -
variance
: The variance of the test statistic under the null hypothesis. -
p_value
: The p-value associated with the two-sided hypothesis test. -
reject
: Logical. IfTRUE
, the null hypothesis is rejected atalpha
.
References
Kendall, M. (1975). Rank Correlation Methods. Griffin, London, 202 pp.
Mann, H. B. (1945). Nonparametric Tests Against Trend. Econometrica, 13(3): 245-25
See Also
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
eda_mk_test(data)
Mann–Kendall–Sneyers Test for Change Point Detection
Description
Performs the Mann–Kendall–Sneyers (MKS) test to detect a trend change in annual maximum series data. The test computes normalized progressive and regressive Mann–Kendall statistics and identifies statistically significant crossing points, indicating potential change points. Under the null hypothesis, there are no trend changes.
Usage
eda_mks_test(data, years, alpha = 0.05)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
alpha |
Numeric scalar in |
Details
This function computes progressive and regressive Mann–Kendall-Sneyers statistics, normalized by their expected values and variances under the null hypothesis. The crossing points occur when the difference between the progressive and regressive statistics switches sign. The significance of detected crossing points is assessed using the quantiles of the normal distribution.
Value
A list containing the test results, including:
-
data
: Thedata
argument. -
years
: Theyears
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
progressive_series
: Normalized progressive Mann–Kendall-Sneyers statistics. -
regressive_series
: Normalized regressive Mann–Kendall-Sneyers statistics. -
bound
: Critical confidence bound for significance based onalpha
. -
change_points
: A dataframe of potential change points. -
p_value
: Two-sided p-value of the most significant crossing point. -
reject
: IfTRUE
, the null hypothesis was rejected at significancealpha
.
change_points
contains the years, test statistics, and p-values of each
potential change point. If no change points were identified, change_points
is empty.
References
Sneyers, R. (1990). On the statistical analysis of series of observations. Technical note No. 143, World Meteorological Organization, Geneva.
See Also
plot_mks_test()
, eda_pettitt_test()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
eda_mks_test(data, years)
Pettitt Test for Abrupt Changes in the Mean of a Time Series
Description
Performs the nonparametric Pettitt test to detect a single abrupt change in the central tendency of a time series. Under the null hypothesis, there is no change.
Usage
eda_pettitt_test(data, years, alpha = 0.05)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
alpha |
Numeric scalar in |
Value
A list containing the test results, including:
-
data
: Thedata
argument. -
years
: Theyears
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
u_series
: Numeric vector of absolute U-statistics for all years. -
statistic
: The test statistic and maximum absolute U-statistic. -
bound
: The critical value of the test statistic based onalpha
. -
change_points
: A dataframe containing the potential change point. -
p_value
: An asymptotic approximation of the p-value for the test. -
reject
: Logical scalar. IfTRUE
, the null hypothesis was rejected.
change_points
contains the years, test statistics, and p-values of each
potential change point. If no change points were identified, change_points
is empty.
References
Pettitt, A.N., 1979. A Non-parametric Approach to the Change-point Problem. J. Royal Statistics Society 28 (2), 126–135. http://www.jstor.org/stable/2346729
See Also
plot_pettitt_test()
, eda_mks_test()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
eda_pettitt_test(data, years)
Phillips–Perron Unit Root Test
Description
Applies the Phillips–Perron (PP) test to check for a unit root in annual maximum series data. The null hypothesis assumes the time series contains a unit root (also known as a stochastic trend). The alternative hypothesis is that the time series is trend-stationary with a deterministic linear trend.
Usage
eda_pp_test(data, alpha = 0.05)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
alpha |
Numeric scalar in |
Details
The implementation of this test is based on the aTSA package, which
interpolates p-values from the table of critical values presented in Fuller W.
A. (1996). The critical values are only available for \alpha \geq 0.01
.
Therefore, a reported p-value of 0.01 indicates that p \leq 0.01
.
Value
A list containing the test results, including:
-
data
: Thedata
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
statistic
: The PP test statistic. -
p_value
: Reported p-value from the test. See the details for more information. -
reject
: IfTRUE
, the null hypothesis was rejected at significancealpha
.
References
Fuller, W. A. (1976). Introduction to Statistical Time Series. New York: John Wiley and Sons
Phillips, P. C. B.; Perron, P. (1988). Testing for a Unit Root in Time Series Regression. Biometrika, 75 (2): 335-346
See Also
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
eda_pp_test(data)
Wald–Wolfowitz Runs Test for Randomness
Description
Applies the Wald–Wolfowitz runs test to a numeric vector in order to assess whether it behaves as a random sequence. The test statistic and p-value is computed using the number of runs (sequences of values above or below the median). Under the null hypothesis, the data is random. The runs test can be used to assess whether the residuals of a nonstationary model are random, indicating the suitability of the assumed nonstationary structure (e.g., linear).
Usage
eda_runs_test(values, years, alpha = 0.05)
Arguments
values |
A numeric vector of values to check for randomness. |
years |
Numeric vector of observation years corresponding to |
alpha |
Numeric scalar in |
Value
A list containing the test results, including:
-
values
: Thevalues
argument. -
years
: Theyears
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
n
: The length of the input vector after removing the median. -
runs
: The number of runs in the transformed sequence of residuals. -
statistic
: The runs test statistic, computed usingruns
andn
. -
p_value
: The p-value derived from the normally distributed test statistic. -
reject
: IfTRUE
, the null hypothesis was rejected at significancealpha
.
References
Wald, A. and Wolfowitz, J. (1940). On a test whether two samples are from the same population. Annals of Mathematical Statistics, 11(2), 147–162.
See Also
plot_runs_test()
, eda_sens_trend()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
sens_trend <- eda_sens_trend(data, years)
eda_runs_test(sens_trend$residuals, years)
Sen's Trend Estimator
Description
Computes Sen's linear trend estimator for a univariate time series. The estimated
slope and y-intercept are given in terms of the data and the covariate (time),
which is derived from the years using the formula (\text{Years} - 1900) / 100
.
Usage
eda_sens_trend(data, years)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
Details
Sen's slope estimator is a robust, nonparametric trend estimator based on the
median of all pairwise slopes between data points. The corresponding intercept
is the median of each y_i - mx_i
where m
is the estimated slope.
Value
A list containing the estimated trend:
-
data
: Thedata
argument. -
years
: Theyears
argument. -
slope
: The estimated slope. -
intercept
: The estimated y-intercept. -
residuals
: Vector of differences between the predicted and observed values.
References
Sen, P.K. (1968). Estimates of the regression coefficient based on Kendall's tau. Journal of the American Statistical Association, 63(324), 1379–1389.
See Also
eda_runs_test()
, plot_ams_data()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
eda_sens_trend(data, years)
Spearman Test for Autocorrelation
Description
Performs the Spearman serial correlation test on annual maximum series data to check for serial correlation at various lags. Reports the smallest lag where the serial correlation is not statistically significant at the given significance level (the least insignificant lag).
Usage
eda_spearman_test(data, alpha = 0.05)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
alpha |
Numeric scalar in |
Value
A list containing the test results, including:
-
data
: Thedata
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
rho
: Numeric vector of serial correlation estimates for lags1
ton-3
. -
least_lag
: The smallest lag at which the serial correlation is insignificant. -
significant
: Indicates whether the serial correlation is significant at each lag. -
reject
: IfTRUE
, the null hypothesis was rejected at significancealpha
.
See Also
stats::cor.test()
, eda_bbmk_test()
, plot_spearman_test()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
eda_spearman_test(data)
White Test for Heteroskedasticity
Description
Performs the White test for heteroskedasticity by regressing the squared residuals of a linear model on the original regressors and their squared terms. The null hypothesis is homoskedasticity.
Usage
eda_white_test(data, years, alpha = 0.05)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
alpha |
Numeric scalar in |
Details
The White test regresses the squared residuals from a primary linear model
lm(data ~ years)
against both the original regressor and its square.
The test statistic is calculated as nR^2
, where R^2
is the
coefficient of determination from the auxiliary regression and n
is
the number of elements in the time series. Under the null hypothesis, the
test statistic has the \chi^2
distribution with 2 degrees of freedom.
Value
A list containing the results of the White test:
-
data
: Thedata
argument. -
years
: Theyears
argument. -
alpha
: The significance level as specified in thealpha
argument. -
null_hypothesis
: A string describing the null hypothesis. -
alternative_hypothesis
: A string describing the alternative hypothesis. -
statistic
: White test statistic based on sample size andr_squared
. -
p_value
: The p-value derived from a Chi-squared distribution withdf = 2
. -
reject
: IfTRUE
, the null hypothesis was rejected at significancealpha
.
References
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48(4), 817–838.
See Also
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
eda_white_test(data, years)
Generalized Maximum Likelihood Parameter Estimation
Description
Estimates the parameters of the generalized extreme value (GEV) distribution by
maximizing the generalized log‐likelihood, which incorporates a Beta prior on the
shape parameter. Initial parameter estimates are obtained using the method of L‐moments
and optimization is performed via stats::nlminb()
with repeated perturbations if
needed.
For NS-FFA: To estimate parameters for a nonstationary model, include the
observation years (ns_years
) and the nonstationary model structure (ns_structure
).
Usage
fit_gmle(data, prior, ns_years = NULL, ns_structure = NULL)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
prior |
Numeric vector of length 2. Specifies the parameters of the
Beta prior for the shape parameter |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Details
Calls
fit_lmoments()
on the data to obtain initial parameter estimates.Initializes trend parameters to zero if necessary.
Defines an objective function using
utils_generalized_likelihood()
.Runs
stats::nlminb()
with box constraints. Attempts minimization up to 100 times.
Value
A list containing the results of parameter estimation:
-
data
: Thedata
argument. -
prior
: Theprior
argument. -
ns_years
: Thens_years
argument, if given. -
ns_structure
: Thens_structure
argument, if given. -
method
:"GMLE"
. -
params
: Numeric vector of estimated parameters. -
mll
: The maximum value of the generalized log‐likelihood.
References
El Adlouni, S., Ouarda, T.B.M.J., Zhang, X., Roy, R., Bobee, B., 2007. Generalized maximum likelihood estimators for the nonstationary generalized extreme value model. Water Resources Research 43 (3), 1–13. doi:10.1029/2005WR004545
Martins, E. S., and Stedinger, J. R. (2000). Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resources Research, 36(3), 737–744. doi:10.1029/1999WR900330
See Also
utils_generalized_likelihood()
, fit_lmoments()
, stats::nlminb()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
prior <- c(6, 9)
ns_years <- seq(from = 1901, to = 2000)
ns_structure <- list(location = TRUE, scale = FALSE)
fit_gmle(data, prior, ns_years, ns_structure)
L-Moments Parameter Estimation
Description
For S-FFA only: Estimates the parameters of a stationary probability model using the L-moments.
Usage
fit_lmoments(data, distribution)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
distribution |
A three-character code indicating the distribution family.
Must be |
Details
First, the sample L-moments of the data are computed using utils_sample_lmoments()
.
Then, formulas from Hosking (1997) are used to match the parameters to
the sample L-moments. The distributions "GNO"
, "PE3"
, and "LP3"
use a
rational approximation of the parameters since no closed-form expression is known.
Value
A list containing the results of parameter estimation:
-
data
: Thedata
argument. -
distribution
: Thedistribution
argument. -
method
:"L-moments"
. -
params
: Numeric vector of estimated parameters.
References
Hosking, J.R.M. & Wallis, J.R., 1997. Regional frequency analysis: an approach based on L-Moments. Cambridge University Press, New York, USA.
See Also
fit_lmoments_kappa()
, fit_mle()
, fit_gmle()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
fit_lmoments(data, "GUM")
L-Moments Parameter Estimation for the Kappa Distribution
Description
This function estimates the parameters of the four-parameter Kappa distribution using the method of L-moments. Since no closed-form solution for the parameters in terms of the L-moments is known, the parameters are estimated numerically using Newton-Raphson iteration.
Usage
fit_lmoments_kappa(data)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
Details
First, the sample L-moments of the data are computed using utils_sample_lmoments()
.
Then, the stats::optim()
function is used to determine the parameters
by minimizing the euclidian distance between the sample and theoretical L-moment
ratios. The implementation of this routine is based on the deprecated homtest
package, formerly available at https://CRAN.R-project.org/package=homtest.
Value
A list containing the results of parameter estimation:
-
data
: Thedata
argument. -
distribution
:"KAP"
. -
method
:"L-moments"
. -
params
: numeric vector of 4 parameters in the order location, scale, shape (2).
References
Hosking, J.R.M. & Wallis, J.R., 1997. Regional frequency analysis: an approach based on L-Moments. Cambridge University Press, New York, USA.
See Also
utils_sample_lmoments()
, fit_lmoments()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
fit_lmoments_kappa(data)
Maximum Likelihood Parameter Estimation
Description
Estimates the parameters of a probability distribution by maximizing the log‐likelihood.
Initial parameter estimates are obtained using the method of L‐moments and optimization
is performed via stats::nlminb()
with repeated perturbations if needed.
For NS-FFA: To estimate parameters for a nonstationary model, include the
observation years (ns_years
) and the nonstationary model structure
(ns_structure
).
Usage
fit_mle(data, distribution, ns_years = NULL, ns_structure = NULL)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
distribution |
A three-character code indicating the distribution family.
Must be |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Details
Calls
fit_lmoments()
ondata
to obtain initial parameter estimates.Initializes trend parameters to zero if necessary.
For
WEI
models, sets the location parameter to zero to ensure support.Defines an objective function using
utils_log_likelihood()
.Runs
stats::nlminb()
with box constraints. Attempts minimization up to 100 times.
Value
A list containing the results of parameter estimation:
-
data
: Thedata
argument. -
distribution
: Thedistribution
argument. -
ns_years
: Thens_years
argument, if given. -
ns_structure
: Thens_structure
argument, if given. -
method
:"MLE"
. -
params
: Numeric vector of estimated parameters. -
mll
: The maximum value of the log‐likelihood.
See Also
utils_log_likelihood()
, fit_lmoments()
, stats::nlminb()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
ns_years <- seq(from = 1901, to = 2000)
ns_structure <- list(location = TRUE, scale = FALSE)
fit_mle(data, "GNO", ns_years, ns_structure)
Orchestrate Exploratory Data Analysis
Description
First, this method identifies change points in the original annual maximum series data. Then, the user is given the option to split the dataset into two or more homogenous subperiods (trend-free or with monotonic trends). Finally, this method performs a collection of statistical tests for identifying monotonic nonstationarity in the mean and variability of each subperiod (if the dataset was split) or of the entire dataset (if it was not split). The results of EDA can help guide FFA approach selection (stationary or nonstationary) and FFA model determination.
Usage
framework_eda(
data,
years,
ns_splits = NULL,
generate_report = TRUE,
report_path = NULL,
report_formats = "html",
...
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
ns_splits |
An integer vector of years used to split the data into homogeneous
subperiods. For S-FFA, set to |
generate_report |
If |
report_path |
A character scalar, the file path for the generated report.
If |
report_formats |
A character vector specifying the output format for the
report. Supported values are |
... |
Additional arguments. See the "Optional Arguments" section for a complete list. |
Value
eda_recommendations
: A list containing the recommended FFA approach, split
point(s) and nonstationary structure(s) from EDA:
-
approach
: Either "S-FFA", "NS-FFA" (for a single homogeneous period), or "Piecewise NS-FFA" (for multiple homogeneous subperiods). -
ns_splits
: The split point(s) identified by the change point detection test with the the lowest statistically significant p-value, orNULL
if no such point exists. -
ns_structures
: A list of structure objects for each homogeneous subperiod. Each structure is a list with boolean itemslocation
andscale
, which represent a linear trend in the in the mean or variability of the data, respectively. If no trends were found in any homogeneous subperiod,ns_structures
will beNULL
.
submodule_results
: A list of lists of statistical tests. Each list contains:
-
name
: Either "Change Point Detection" or "Trend Detection". -
start
: The first year of the homogeneous subperiod. -
end
: The last year of the homogeneous subperiod. Additional items from the statistical tests within the submodule.
Optional Arguments
-
alpha
: The numeric significance level for all statistical tests (default is 0.05). -
bbmk_samples
: The number of samples used in the Block-Bootstrap Mann-Kendall (BBMK) test (default is 10000). Must be an integer. -
window_size
: The size of the window used to compute the variability series. -
window_step
: The number of years between successive moving windows. Used to compute the variability series.
See Also
eda_pettitt_test()
, eda_mks_test()
, eda_mk_test()
,
eda_spearman_test()
, eda_bbmk_test()
, eda_pp_test()
, eda_kpss_test()
,
eda_sens_trend()
, eda_runs_test()
, eda_white_test()
Orchestrate Flood Frequency Analysis
Description
Performs frequency analysis of annual maximum series data including distribution selection, parameter estimation, uncertainty quantification, and model assessment. Supports both stationary (S-FFA) or nonstationary (NS-FFA) flood frequency analysis.
Usage
framework_ffa(
data,
years,
ns_splits = NULL,
ns_structures = NULL,
generate_report = TRUE,
report_path = NULL,
report_formats = "html",
...
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
ns_splits |
An integer vector of years used to split the data into homogeneous
subperiods. For S-FFA, set to |
ns_structures |
For S-FFA, set to |
generate_report |
If |
report_path |
A character scalar, the file path for the generated report.
If |
report_formats |
A character vector specifying the output format for the
report. Supported values are |
... |
Additional arguments. See the "Optional Arguments" section for a complete list. |
Value
modelling_assumptions
: A list describing the model(s) used for the analysis.
-
approach
: Either "S-FFA", "NS-FFA", or "Piecewise NS-FFA". -
ns_splits
: Thens_splits
argument, if given. -
ns_structures
: Thens_structures
argument, if given.
submodule_results
: A list of lists of containing the results of frequency analysis.
Each list contains:
-
name
: Either "Distribution Selection", "Parameter Estimation", "Uncertainty Quantification", or "Model Assessment". -
start
: The first year of the homogeneous subperiod. -
end
: The last year of the homogeneous subperiod. Additional items specific to the the submodule.
Optional Arguments
-
selection
: Distribution selection method (default is"L-distance"
). Must be one of"L-distance"
,"L-kurtosis"
or"Z-statistic"
. Alternatively, setselection
to a three-letter distribution code (e.g.,"GUM"
) to use a specific distribution. -
s_estimation
: Parameter estimation method for S-FFA (default is"L-moments"
). Must be"L-moments"
,"MLE"
, or"GMLE"
. Method"GMLE"
requiresselection = "GEV"
. -
ns_estimation
: Parameter estimation method for NS-FFA (default is"MLE"
). Must be"MLE"
or"GMLE"
. Method"GMLE"
requiresselection = "GEV"
. -
s_uncertainty
: Uncertainty quantification method for S-FFA (default is"Bootstrap"
). Must be one of"Bootstrap"
,"RFPL"
, orRFGPL"
. Using method"RFPL"
requiress_estimation = "MLE"
and method"RFGPL"
requiress_estimation = "GMLE"
. -
ns_uncertainty
: Uncertainty quantification method for NS-FFA (default is"RFPL"
). Must be one of"Bootstrap"
,"RFPL"
, orRFGPL"
. Using method"RFPL"
requiresns_estimation = "MLE"
and method"RFGPL"
requiresns_estimation = "GMLE"
. -
z_samples
: Integer number of bootstrap samples for selection method"Z-statistic"
(default is 10000). -
gev_prior
: Parameters for the prior distribution of the shape parameter of the GEV distribution (default is 6, 9). Used with estimation method"GMLE"
. -
return_periods
: Integer list of return periods (in years) for estimating return levels. Uses the 2, 5, 10, 20, 50, and 100 year return periods by default. -
ns_slices
: Integer vector of years at which to estimate the return levels for nonstationary models. Slices outside of the period will be ignored (default is 1925, 1975, 2025). -
bootstrap_samples
: Integer number of samples for uncertainty quantification method '"Bootstrap" (default is 10000). -
rfpl_tolerance
: Log-likelihood tolerance for uncertainty quantification method"RFPL"
(default is 0.01). -
pp_formula
: Plotting position formula for model assessment. Must be one of:"Weibull" (default):
i / (n + 1)
"Blom":
(i - 0.375) / (n + 0.25)
"Cunnane":
(i - 0.4) / (n + 0.2)
"Gringorten":
(i - 0.44) / (n + 0.12)
"Hazen":
(i - 0.5) / n
See Also
select_ldistance()
, select_lkurtosis()
, select_zstatistic()
,
fit_lmoments()
, fit_mle()
, fit_gmle()
, uncertainty_bootstrap()
,
uncertainty_rfpl()
, uncertainty_rfgpl()
, model_assessment()
Orchestrate the Full FFA Framework
Description
Runs the entire flood frequency analysis framework using the results of exploratory data analysis (EDA) to guide approach selection (stationary or nonstationary) and perform flood frequency analysis. Returns a comprehensive and reproducible summary of the results.
Usage
framework_full(
data,
years,
ns_splits = NULL,
ns_structures = NULL,
generate_report = TRUE,
report_path = NULL,
report_formats = "html",
...
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
ns_splits |
An integer vector of years used to split the data into homogeneous
subperiods. For S-FFA, set to |
ns_structures |
For S-FFA, set to |
generate_report |
If |
report_path |
A character scalar, the file path for the generated report.
If |
report_formats |
A character vector specifying the output format for the
report. Supported values are |
... |
Additional arguments to be passed to the statistical tests and frequency
analysis functions. See the details of |
Value
eda_recommendations
: See framework_eda()
.
modelling_assumptions
: See framework_ffa()
.
submodule_results
: A list of lists of results. Each list contains:
-
name
: Either "Change Point Detection", "Trend Detection", "Distribution Selection", "Parameter Estimation", "Uncertainty Quantification", or "Model Assessment". -
start
: The first year of the homogeneous subperiod. -
end
: The last year of the homogeneous subperiod. Additional items specific to the the submodule.
See Also
framework_eda()
, framework_ffa()
Model Assessment
Description
Computes various metrics for assessing the quality of a fitted flood frequency model. Metrics include accuracy (residual statistics), fitting efficiency (information criteria), and uncertainty (coverage based metrics using confidence intervals).
For NS-FFA: The metrics are computed from the normalized empirical/theoretical quantiles, which are determined based on the selected distribution family. Therefore, metrics from stationary and nonstationary models should not be compared.
Usage
model_assessment(
data,
distribution,
method,
prior = NULL,
ns_years = NULL,
ns_structure = NULL,
alpha = 0.05,
pp_formula = "Weibull",
ci = NULL
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
distribution |
A three-character code indicating the distribution family.
Must be |
method |
Character scalar specifying the estimation method.
Must be |
prior |
Numeric vector of length 2. Specifies the parameters of the
Beta prior for the shape parameter |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
alpha |
Numeric scalar in |
pp_formula |
Character string specifying the plotting position formula.
Must |
ci |
For S-FFA only. Dataframe containing return periods (in the column
|
Details
These metrics are are computed for all models:
-
AIC_MLL
: Akaike Information Criterion, computed using the maximum log-likelihood. -
BIC_MLL
: Bayesian Information Criterion, computed using the maximum log-likelihood. -
R2
: Coefficient of determination from linear regression of estimates vs. data. -
RMSE
: Root mean squared error of quantile estimates. -
bias
: Mean bias of quantile estimates. -
AIC_RMSE
: Akaike Information Criterion, computed using the RMSE. -
BIC_RMSE
: Bayesian Information Criterion, computed using the RMSE.
These metrics are only computed if the ci
argument is provided:
-
AW
: Average width of the confidence interval(s). -
POC
: Percent of observations covered by the confidence interval(s). -
CWI
: Confidence width index, a metric that combinesAW
andPOC
.
Value
List containing the results of model assessment:
-
data
: Thedata
argument. -
q_theoretical
: The theoretical return level estimates based on the plotting positions. Normalized for nonstationary models. -
q_empirical
: The empirical return levels. Normalized for nonstationary models. -
metrics
: A list of model assessment metrics (see details).
See Also
uncertainty_bootstrap()
, uncertainty_rfpl()
, uncertainty_rfgpl()
,
plot_sffa_assessment()
, plot_nsffa_assessment()
Examples
# Initialize example data and params
data <- rnorm(n = 100, mean = 100, sd = 10)
params <- c(100, 10)
# Perform uncertainty analysis
ci <- uncertainty_bootstrap(data, "NOR", "L-moments")$ci
# Run model assessment
model_assessment(data, "NOR", "L-moments", ci = ci)
Compute Location and Scale of Kappa Distribution
Description
Compute Location and Scale of Kappa Distribution
Usage
mu_sigma(l1, l2, k, h)
Parameter 'alpha'
Description
Parameter 'alpha'
Arguments
alpha |
Numeric scalar in |
Parameter 'data'
Description
Parameter 'data'
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
Parameter 'distribution'
Description
Parameter 'distribution'
Arguments
distribution |
A three-character code indicating the distribution family.
Must be |
Parameter 'generate_report'
Description
Parameter 'generate_report'
Arguments
generate_report |
If |
Parameter 'method'
Description
Parameter 'method'
Arguments
method |
Character scalar specifying the estimation method.
Must be |
Parameter 'ns_slice'
Description
Parameter 'ns_slice'
Arguments
ns_slice |
For NS-FFA only: Numeric scalar specifying the year at which to
evaluate the quantiles of a nonstationary probability distribution. |
Parameter 'ns_slices'
Description
Parameter 'ns_slices'
Arguments
ns_slices |
For NS-FFA only: Numeric vector specifying the years at which to
evaluate the return levels confidence intervals of a nonstationary probability
distribution. |
Parameter 'ns_splits'
Description
Parameter 'ns_splits'
Arguments
ns_splits |
An integer vector of years used to split the data into homogeneous
subperiods. For S-FFA, set to |
Parameter 'ns_structure'
Description
Parameter 'ns_structure'
Arguments
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Parameter 'ns_structures'
Description
Parameter 'ns_structures'
Arguments
ns_structures |
For S-FFA, set to |
Parameter 'ns_years'
Description
Parameter 'ns_years'
Arguments
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
Parameter 'p'
Description
Parameter 'p'
Arguments
p |
Numeric vector of probabilities between 0 and 1 with no missing values. |
Parameter 'params'
Description
Parameter 'params'
Arguments
params |
Numeric vector of distribution parameters, in the order (location,
scale, shape). The length must be between 2 and 5, depending on the specified
|
Parameter 'periods'
Description
Parameter 'periods'
Arguments
periods |
Numeric vector used to set the return periods for FFA. All entries must be greater than or equal to 1. |
Parameter 'prior'
Description
Parameter 'prior'
Arguments
prior |
Numeric vector of length 2. Specifies the parameters of the
Beta prior for the shape parameter |
Parameter 'q'
Description
Parameter 'q'
Arguments
q |
Numeric vector of quantiles with no missing values. |
Parameter 'report_formats'
Description
Parameter 'report_formats'
Arguments
report_formats |
A character vector specifying the output format for the
report. Supported values are |
Parameter 'report_path'
Description
Parameter 'report_path'
Arguments
report_path |
A character scalar, the file path for the generated report.
If |
Parameter 'samples'
Description
Parameter 'samples'
Arguments
samples |
Integer scalar. The number of bootstrap samples. Default is 10000. |
Parameter 'tolerance'
Description
Parameter 'tolerance'
Arguments
tolerance |
The log-likelihood tolerance for Regula-Falsi convergence (default is 0.01). |
Parameter 'years'
Description
Parameter 'years'
Arguments
years |
Numeric vector of observation years corresponding to |
Plot Annual Maximum Series Data
Description
Produces a scatterplot of annual maximum series data against time, optionally overlaid with the sample mean/variability or Sen's trend estimator of the mean/variability.
Usage
plot_ams_data(
data,
years,
plot_mean = "None",
plot_variability = "None",
show_line = TRUE,
...
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
years |
Numeric vector of observation years corresponding to |
plot_mean |
If "None" (default), the mean will not be plotted. If |
plot_variability |
If "None" (default), the variability will not be plotted.
If |
show_line |
If |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot containing:
Gray points for each year’s annual maximum series value.
A gray line connecting the data if
show_line = TRUE
.A solid black line representing a constant mean, if
plot_mean == "Constant"
.A solid blue line representing a trend in the mean, if
plot_mean == "Trend"
.A dashed black line representing constant variability, if
plot_variability == "Constant"
.A dashed blue line representing a trend in variability, if
plot_variability == "Trend"
.
See Also
eda_sens_trend()
, data_mw_variability()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
plot_ams_data(data, years, plot_mean = "Trend", plot_variability = "Constant")
Plot Block‐Bootstrap Mann–Kendall Test Results
Description
Generates a histogram of bootstrapped Mann–Kendall S‐statistics with vertical lines indicating the observed S‐statistic and confidence bounds.
Usage
plot_bbmk_test(results, ...)
Arguments
results |
List of BB‐MK test results generated by |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; A plot containing:
A gray histogram of the distribution of bootstrapped S‐statistics.
A red vertical line at the lower and upper confidence bounds.
A black vertical line at the observed S‐statistic.
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
results <- eda_bbmk_test(data, samples = 1000L)
plot_bbmk_test(results)
Plot L-Moment Ratio Diagram
Description
Generates a plot of L-moment ratios with the L-skewness on the x-axis and L-kurtosis on the y-axis. Plots the sample and log-sample L-moment ratios alongside the theoretical L-moment ratios for a set of candidate distributions. Also includes a small inset around the L-moment ratios of the recommended distribution.
Usage
plot_lmom_diagram(results, ...)
Arguments
results |
List of distribution selection results generated by |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; plot object containing the L-moment ratio diagram, with:
L-moment ratio curves for each 3-parameter distribution.
Points for the L-moment ratios of each 2-parameter distribution.
Sample and log-sample L-moment ratio
(t_3, t_4)
points.
See Also
select_ldistance()
, select_lkurtosis()
, select_zstatistic()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
results <- select_ldistance(data)
plot_lmom_diagram(results)
Plot Mann–Kendall–Sneyers (MKS) Test Results
Description
Constructs a two‐panel visualization of the MKS test. The upper panel plots the normalized progressive and regressive Mann–Kendall S‐statistics over time, with dashed confidence bounds and potential trend‐change points. The lower panel contains the annual maximum series data with the change points highlighted.
Usage
plot_mks_test(results, show_line = TRUE, ...)
Arguments
results |
A list generated by |
show_line |
If |
... |
Optional named arguments: 'title', 'top_xlabel', 'top_ylabel', 'bottom_xlabel' and 'bottom_ylabel'. |
Value
A patchwork
object with two ggplot2
panels stacked vertically.
See Also
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
results <- eda_mks_test(data, years)
plot_mks_test(results)
Plot Model Assessment for NS-FFA
Description
Creates a normalized and detrended quantile–quantile plot (a worm plot) comparing empirical annual maximum series data to quantile estimates from a fitted, parametric, and nonstationary model. Confidence intervals are also provided. The worm plot can be interpreted using the following heuristics:
For a satisfactory fit, the worm (red data points) should be within the 95% confidence interval (dashed black lines).
If the worm (red points) is passes above/below the origin, the model mean is too small/large respectively.
If the worm has a clear positive/negative slope, the model variance is too small/large respectively.
If the worm has a U-shape or inverted U-shape, the model is too skew to the left/right respectively.
Usage
plot_nsffa_assessment(results, ...)
Arguments
results |
List; model assessment results generated by |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot containing:
A black line representing a model with no deviation from the empirical quantiles.
Red points denoting the estimated quantiles against the empirical quantiles.
A dashed black line showing the 95% confidence intervals of the residuals.
References
van Buuren, S., Fredriks, M. Worm plot: a simple diagnostic device for modelling growth reference curves. Statistics in Medicine 20 (8), 1259-1277. doi:10.1002/sim.746
Examples
# Initialize example data and params
data <- rnorm(n = 100, mean = 100, sd = 10) + seq(from = 1, to = 100)
years <- seq(from = 1901, to = 2000)
structure <- list(location = TRUE, scale = FALSE)
# Evaluate model diagnostics
results <- model_assessment(data, "NOR", "MLE", NULL, years, structure)
# Generate a model assessment plot
plot_nsffa_assessment(results)
Plot Estimated Return Levels for NS-FFA
Description
Generates a plot with effective return periods on the x-axis and effective return levels (annual maxima magnitudes) on the y-axis. Each slice is displayed in a distinct color. Confidence bounds are shown as semi-transparent ribbons, and the point estimates are overlaid as solid lines. Return periods have a logarithmic scale.
Usage
plot_nsffa_estimates(
results,
slices = c(1900, 1940, 1980, 2020),
periods = c(2, 5, 10, 20, 50, 100),
...
)
Arguments
results |
A fitted flood frequency model generated by |
slices |
Default time slices for plotting the return levels if confidence intervals are not provided. |
periods |
Numeric vector used to set the return periods for FFA. All entries must be greater than or equal to 1. |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot with one line and ribbon per slice.
Examples
# Fit a nonstationary model
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
ns_structure <- list(location = TRUE, scale = FALSE)
results <- fit_mle(
data,
"GEV",
ns_years = years,
ns_structure = ns_structure
)
# Generate the plot
plot_nsffa_estimates(results)
Plot Fitted Probability Distributions for NS-FFA
Description
Generates a plot showing probability densities of a nonstationary model for selected time slices (left panel) and the data (right panel).
Usage
plot_nsffa_fit(
results,
slices = c(1925, 1950, 1975, 2000),
show_line = TRUE,
...
)
Arguments
results |
A fitted flood frequency model generated by |
slices |
Years at which to plot the nonstationary probability model. |
show_line |
If |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot showing:
The likelihood function of the distribution plotted vertically on the left panel.
The data, connected with a line if
show_line == TRUE
, on the right panel.
Examples
data <- rnorm(n = 100, mean = 100, sd = 10) + seq(1, 100)
years <- seq(from = 1901, to = 2000)
ns_structure <- list(location = TRUE, scale = FALSE)
results <- fit_mle(
data,
"GEV",
ns_years = years,
ns_structure = ns_structure
)
plot_nsffa_fit(results)
Plot Results from the Pettitt Change‐Point Test
Description
Creates a two‐panel visualization of the Mann–Whitney–Pettitt test. The
upper panel plots the Pettitt U_t
statistic over time along with the
significance threshold and potential change point. The lower panel displays
the annual maximum series data with an optional trend line, the period
mean(s), and potential change point(s).
Usage
plot_pettitt_test(results, show_line = TRUE, ...)
Arguments
results |
A list generated by |
show_line |
If |
... |
Optional named arguments: 'title', 'top_xlabel', 'top_ylabel', 'bottom_xlabel' and 'bottom_ylabel'. |
Value
A patchwork
object with two ggplot2
panels stacked vertically.
See Also
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
results <- eda_pettitt_test(data, years)
plot_pettitt_test(results)
Plot Runs Test Results
Description
Generates a residual plot of Sen's estimator applied to annual maximum series data (or the variability of the data) with a horizontal dashed line at zero and an annotation indicating the p-value of the Runs test.
Usage
plot_runs_test(results, ...)
Arguments
results |
A list of runs test results generated by |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot containing:
Black points for the residual at each year.
A red dashed horizontal line at
y = 0
.A text annotation “Runs p-value: X.XXX” in the plot area.
See Also
Examples
# Initialize data and years
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
# Generate the runs test plot
sens_trend <- eda_sens_trend(data, years)
results <- eda_runs_test(sens_trend$residuals, years)
plot_runs_test(results)
Plot Model Assessment for S-FFA
Description
Creates a quantile–quantile plot comparing observed annual maximum series data to quantile estimates from a fitted parametric model. The 1:1 line is drawn in black and the parametric model estimates are plotted as semi‐transparent red points.
Usage
plot_sffa_assessment(results, ...)
Arguments
results |
List; model assessment results generated by |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot containing:
A black line representing a model with no deviation from the empirical quantiles.
Red points denoting the estimated quantiles against the empirical quantiles.
Examples
# Initialize example data
data <- rnorm(n = 100, mean = 100, sd = 10)
# Evaluate model diagnostics
results <- model_assessment(data, "NOR", "L-moments")
# Generate a model assessment plot
plot_sffa_assessment(results)
Plot Estimated Return Levels for S-FFA
Description
Generates a plot with return periods on the x-axis and return levels (annual maxima magnitudes) on the y-axis for S-FFA. The confidence bound is shown as a semi-transparent ribbon, and the point estimates are overlaid as a solid line. Return periods are shown on a logarithmic scale.
Usage
plot_sffa_estimates(results, periods = c(2, 5, 10, 20, 50, 100), ...)
Arguments
results |
A fitted flood frequency model generated by |
periods |
Numeric vector used to set the return periods for FFA. All entries must be greater than or equal to 1. |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot showing:
A solid black line for the point estimates produced by the model.
A semi-transparent gray ribbon indicating the confidence interval, if given.
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
results <- fit_lmoments(data, "WEI")
plot_sffa_estimates(results)
Plot Fitted Probability Distribution for S-FFA
Description
Generates a plot showing the probability density of a stationary model (left panel) and the data (right panel).
Usage
plot_sffa_fit(results, show_line = TRUE, ...)
Arguments
results |
A fitted flood frequency model generated by |
show_line |
If |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot showing:
The likelihood function of the distribution plotted vertically on the left panel.
The data, connected with a line if
show_line == TRUE
, on the right panel.
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
years <- seq(from = 1901, to = 2000)
results <- fit_lmoments(data, "WEI")
plot_sffa_fit(results)
Plot Spearman’s Rho Autocorrelation
Description
Visualizes Spearman’s rho serial correlation coefficients with shaded points indicating statistical significance.
Usage
plot_spearman_test(results, ...)
Arguments
results |
A list generated by |
... |
Optional named arguments: 'title', 'xlabel', and 'ylabel'. |
Value
ggplot
; a plot showing:
Vertical segments from
y=0
up to each\rho
value at its lag.Filled circles at each lag, filled black if serial correlation is detected.
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
results <- eda_spearman_test(data)
plot_spearman_test(results)
L-Distance Method for Distribution Selection
Description
Selects a distribution from a set of candidate distributions by minimizing the
Euclidean distance between the theoretical L-moment ratios (\tau_3, \tau_4)
and the sample L-moment ratios (t_3, t_4)
.
For NS-FFA: To select a distribution for a nonstationary model, include the
observation years (ns_years
) and the nonstationary model structure
(ns_structure
). Then, this method will detrend the original, nonstationary data
internally using the data_decomposition()
function prior to distribution selection.
Usage
select_ldistance(data, ns_years = NULL, ns_structure = NULL)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Details
For each candidate distribution, this method computes the Euclidean distance between
sample L-moment ratios (\tau_3
, \tau_4
) and the closest point on the
theoretical distribution's L-moment curve. For two-parameter distributions (Gumbel,
Normal, Log-Normal), the theoretical L-moment ratios are compared directly with
the sample L-moment ratios. The distribution with the minimum distance is selected.
If a distribution is fit to log-transformed data (Log-Normal or Log-Pearson Type
III), the L-moment ratios for the log-transformed sample are used instead.
Value
A list with the results of distribution selection:
-
method
:"L-distance"
. -
decomposed_data
: The detrended dataset used to compute the L-moments. For S-FFA, this is thedata
argument. For NS-FFA, it is output ofdata_decomposition()
. -
metrics
: A list of L-distance metrics for each candidate distribution. -
recommendation
: The name of the distribution with the smallest L-distance.
References
Hosking, J.R.M. & Wallis, J.R., 1997. Regional frequency analysis: an approach based on L-Moments. Cambridge University Press, New York, USA.
See Also
utils_sample_lmoments()
, select_lkurtosis()
, select_zstatistic()
,
plot_lmom_diagram()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
select_ldistance(data)
L-Kurtosis Method for Distribution Selection
Description
Selects a probability distribution by minimizing the absolute distance
between the theoretical L-kurtosis (\tau_4
) and the sample L-kurtosis
(t_4
). Only supports 3-parameter distributions.
For NS-FFA: To select a distribution for a nonstationary model, include the
observation years (ns_years
) and the nonstationary model structure
(ns_structure
). Then, this method will detrend the original, nonstationary data
internally using the data_decomposition()
function prior to distribution selection.
Usage
select_lkurtosis(data, ns_years = NULL, ns_structure = NULL)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Details
This method computes the distance between the sample and theoretical L-kurtosis
values at a fixed L-skewness. For three parameter distributions, the shape parameter
that best replicates the sample L-skewness is determined using stats::optim()
.
Value
A list with the results of distribution selection:
-
method
:"L-kurtosis"
. -
decomposed_data
: The detrended dataset used to compute the L-moments. For S-FFA, this is thedata
argument. For NS-FFA, it is output ofdata_decomposition()
. -
metrics
: A list of L-kurtosis metrics for each distribution. -
recommendation
: Name of the distribution with the smallest L-kurtosis metric.
References
Hosking, J.R.M. & Wallis, J.R., 1997. Regional frequency analysis: an approach based on L-Moments. Cambridge University Press, New York, USA.
See Also
utils_sample_lmoments()
, select_ldistance()
, select_zstatistic()
,
plot_lmom_diagram()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
select_lkurtosis(data)
Z-Statistic Method for Distribution Selection
Description
Selects the best-fit distribution by comparing a bias-corrected Z-statistic for
the sample L-kurtosis (\tau_4
) against the theoretical L-moments for a set
of candidate distributions. The distribution with the smallest absolute Z-statistic
is selected.
For NS-FFA: To select a distribution for a nonstationary model, include the
observation years (ns_years
) and the nonstationary model structure
(ns_structure
). Then, this method will detrend the original, nonstationary data
internally using the data_decomposition()
function prior to distribution selection.
Usage
select_zstatistic(data, ns_years = NULL, ns_structure = NULL, samples = 10000L)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
samples |
Integer scalar. The number of bootstrap samples. Default is 10000. |
Details
First, this method fits a four-parameter Kappa distribution to both the original and log-transformed data. Then, bootstrapping is used to estimate the bias and variance of the L-kurtosis. These values, along with the difference between the sample and theoretical L-kurtosis, are used to compute the Z-statistic for each distribution.
Value
A list with the results of distribution selection:
-
method
:"Z-selection"
. -
decomposed_data
: The detrended dataset used to compute the L-moments. For S-FFA, this is thedata
argument. For NS-FFA, it is output ofdata_decomposition()
. -
metrics
: List of computed Z-statistics for each candidate distribution. -
recommendation
: Name of the distribution with the smallest Z-statistic. -
reg_params
: Kappa distribution parameters for the data. -
reg_bias_t4
: Bias of the L-kurtosis from the bootstrap. -
reg_std_t4
: Standard deviation of the L-kurtosis from the bootstrap. -
log_params
: Kappa distribution parameters for the log-transformed data. -
log_bias_t4
: Bias of the L-kurtosis from the bootstrap usinglog_params
. -
log_std_t4
: Standard deviation of the L-kurtosis from the bootstrap usinglog_params
.
References
Hosking, J.R.M. & Wallis, J.R., 1997. Regional frequency analysis: an approach based on L-Moments. Cambridge University Press, New York, USA.
See Also
select_ldistance()
, select_lkurtosis()
, fit_lmoments_kappa()
,
utils_quantiles()
, plot_lmom_diagram()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
select_zstatistic(data)
Compute L-moment Distance for Kappa Distribution
Description
Compute L-moment Distance for Kappa Distribution
Usage
sumquad_tau3tau4(k.h, t3.t4)
Parametric Bootstrap Uncertainty Quantification
Description
Computes return level estimates and confidence intervals at the specified return periods (defaults to 2, 5, 10, 20, 50, and 100 years) using the parametric bootstrap. This function supports many probability models and parameter estimation methods.
For NS-FFA: To perform uncertainty quantification for a nonstationary model,
include the observation years (ns_years
), the nonstationary model structure
(ns_structure
), and a list of years at which to compute the return level estimates
and confidence intervals (ns_slices
).
Usage
uncertainty_bootstrap(
data,
distribution,
method,
prior = NULL,
ns_years = NULL,
ns_structure = NULL,
ns_slices = NULL,
alpha = 0.05,
samples = 10000L,
periods = c(2, 5, 10, 20, 50, 100)
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
distribution |
A three-character code indicating the distribution family.
Must be |
method |
Character scalar specifying the estimation method.
Must be |
prior |
Numeric vector of length 2. Specifies the parameters of the
Beta prior for the shape parameter |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
ns_slices |
For NS-FFA only: Numeric vector specifying the years at which to
evaluate the return levels confidence intervals of a nonstationary probability
distribution. |
alpha |
Numeric scalar in |
samples |
Integer scalar. The number of bootstrap samples. Default is 10000. |
periods |
Numeric vector used to set the return periods for FFA. All entries must be greater than or equal to 1. |
Details
Bootstrap samples are obtained from the fitted distribution via inverse transform
sampling. For each bootstrapped sample, the parameters are re-estimated based on the
method
argument. Then, the bootstrapped parameters are used to compute a new set of
bootstrapped quantiles. Confidence intervals are obtained from the empirical
nonexceedance probabilities of the bootstrapped quantiles.
Value
A list containing the following six items:
-
method
: "Bootstrap" -
distribution
: Thedistribution
argument. -
params
: The fitted parameters. -
ns_structure
: Thens_structure
argument, if given. -
ns_slices
: Thens_slices
argument, if given. -
ci
: A dataframe containing confidence intervals (S-FFA only) -
ci_list
: A list of dataframes containing confidence intervals (NS-FFA only).
The dataframe(s) in ci
and ci_list
have four columns:
-
estimates
: Estimated quantiles for each return period. -
lower
: Lower bound of the confidence interval for each return period. -
upper
: Upper bound of the confidence interval for each return period. -
periods
: Theperiods
argument.
Note
The parametric bootstrap is known to give unreasonably wide confidence intervals
for small datasets. If this method yields a confidence interval that is at least
5 times greater than the magnitude of the return levels, it will return an error
and recommend uncertainty_rfpl()
or uncertainty_rfgpl()
as alternatives.
References
Vidrio-Sahagún, C.T., He, J. Enhanced profile likelihood method for the nonstationary hydrological frequency analysis, Advances in Water Resources 161, 10451 (2022). doi:10.1016/j.advwatres.2022.104151
See Also
fit_lmoments()
, fit_mle()
, fit_gmle()
, utils_sample_lmoments()
utils_quantiles()
, plot_sffa_estimates()
, plot_nsffa_estimates()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
uncertainty_bootstrap(data, "WEI", "L-moments")
Regula-Falsi Generalized Profile Likelihood Uncertainty Quantification
Description
Calculates return level estimates and confidence intervals at specified return periods (defaults to 2, 5, 10, 20, 50, and 100 years) using the regula-falsi generalized profile likelihood root‐finding method for the GEV distribution.
For NS-FFA: To perform uncertainty quantification for a nonstationary model,
include the observation years (ns_years
), the nonstationary model structure
(ns_structure
), and a list of years at which to compute the return level estimates
and confidence intervals (ns_slices
).
Usage
uncertainty_rfgpl(
data,
prior,
ns_years = NULL,
ns_structure = NULL,
ns_slices = NULL,
alpha = 0.05,
periods = c(2, 5, 10, 20, 50, 100),
tolerance = 0.01
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
prior |
Numeric vector of length 2. Specifies the parameters of the
Beta prior for the shape parameter |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
ns_slices |
For NS-FFA only: Numeric vector specifying the years at which to
evaluate the return levels confidence intervals of a nonstationary probability
distribution. |
alpha |
Numeric scalar in |
periods |
Numeric vector used to set the return periods for FFA. All entries must be greater than or equal to 1. |
tolerance |
The log-likelihood tolerance for Regula-Falsi convergence (default is 0.01). |
Details
Uses
fit_gmle()
to obtain the maximum generalized log‐likelihood.Defines an objective function
f(y_p, p)
by reparameterizing the generalized log-likelihood.Iteratively brackets the root by rescaling initial guesses by 0.05 until
f(y_p, p)
changes sign.Uses the regula-falsi method to solve
f(y_p, p) = 0
for each return period probability.Returns lower and upper confidence bounds at significance level
alpha
.
Value
A list containing the following six items:
-
method
: "RFGPL" -
distribution
: "GEV" -
params
: The fitted parameters. -
ns_structure
: Thens_structure
argument, if given. -
ns_slices
: Thens_slices
argument, if given. -
ci
: A dataframe containing confidence intervals (S-FFA only) -
ci_list
: A list of dataframes containing confidence intervals (NS-FFA only).
The dataframe(s) in ci
and ci_list
have four columns:
-
estimates
: Estimated quantiles for each return period. -
lower
: Lower bound of the confidence interval for each return period. -
upper
: Upper bound of the confidence interval for each return period. -
periods
: Theperiods
argument.
Note
RFGPL uncertainty quantification can be numerically unstable for some datasets.
If this function encounters an issue, it will return an error and recommend
uncertainty_bootstrap()
instead.
References
Vidrio-Sahagún, C.T., He, J. Enhanced profile likelihood method for the nonstationary hydrological frequency analysis, Advances in Water Resources 161, 10451 (2022). doi:10.1016/j.advwatres.2022.104151
Vidrio-Sahagún, C.T., He, J. & Pietroniro, A. Multi-distribution regula-falsi profile likelihood method for nonstationary hydrological frequency analysis. Stochastic Environmental Research and Risk Assessment 38, 843–867 (2024). doi:10.1007/s00477-023-02603-0
See Also
utils_quantiles()
, uncertainty_bootstrap()
, uncertainty_rfpl()
,
plot_sffa_estimates()
, plot_nsffa_estimates()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
uncertainty_rfgpl(data, c(6, 9))
Regula-Falsi Profile Likelihood Uncertainty Quantification
Description
Calculates return level estimates and confidence intervals at specified return periods (defaults to 2, 5, 10, 20, 50, and 100 years) using the regula-falsi profile likelihood root‐finding method.
For NS-FFA: To perform uncertainty quantification for a nonstationary model,
include the observation years (ns_years
), the nonstationary model structure
(ns_structure
), and a list of years at which to compute the return level estimates
and confidence intervals (ns_slices
).
Usage
uncertainty_rfpl(
data,
distribution,
ns_years = NULL,
ns_structure = NULL,
ns_slices = NULL,
alpha = 0.05,
periods = c(2, 5, 10, 20, 50, 100),
tolerance = 0.01
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
distribution |
A three-character code indicating the distribution family.
Must be |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
ns_slices |
For NS-FFA only: Numeric vector specifying the years at which to
evaluate the return levels confidence intervals of a nonstationary probability
distribution. |
alpha |
Numeric scalar in |
periods |
Numeric vector used to set the return periods for FFA. All entries must be greater than or equal to 1. |
tolerance |
The log-likelihood tolerance for Regula-Falsi convergence (default is 0.01). |
Details
Uses
fit_mle()
to obtain the maximum log‐likelihood.Defines an objective function
f(y_p, p)
by reparameterizing the log-likelihood.Iteratively brackets the root by rescaling initial guesses by 0.05 until
f(y_p, p)
changes sign.Uses the regula-falsi method to solve
f(y_p, p) = 0
for each return period probability.Returns lower and upper confidence bounds at significance level
alpha
.
Value
A list containing the following four items:
-
method
: "RFPL" -
distribution
: Thedistribution
argument. -
params
: The fitted parameters. -
ns_structure
: Thens_structure
argument, if given. -
ns_slices
: Thens_slices
argument, if given. -
ci
: A dataframe containing confidence intervals (S-FFA only) -
ci_list
: A list of dataframes containing confidence intervals (NS-FFA only).
The dataframe(s) in ci
and ci_list
have four columns:
-
estimates
: Estimated quantiles for each return period. -
lower
: Lower bound of the confidence interval for each return period. -
upper
: Upper bound of the confidence interval for each return period. -
periods
: Theperiods
argument.
Note
RFPL uncertainty quantification can be numerically unstable for some datasets.
If this function encounters an issue, it will return an error and recommend
using the parametric bootstrap method uncertainty_bootstrap()
instead.
References
Vidrio-Sahagún, C.T., He, J. Enhanced profile likelihood method for the nonstationary hydrological frequency analysis, Advances in Water Resources 161, 10451 (2022). doi:10.1016/j.advwatres.2022.104151
Vidrio-Sahagún, C.T., He, J. & Pietroniro, A. Multi-distribution regula-falsi profile likelihood method for nonstationary hydrological frequency analysis. Stochastic Environmental Research and Risk Assessment 38, 843–867 (2024). doi:10.1007/s00477-023-02603-0
See Also
utils_quantiles()
, uncertainty_bootstrap()
, uncertainty_rfgpl()
,
plot_sffa_estimates()
, plot_nsffa_estimates()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
uncertainty_rfpl(data, "GLO")
Cumulative Distribution Functions for Probability Models
Description
Compute probabilities from quantiles for both stationary and nonstationary models.
For NS-FFA: To compute the probabilities for a nonstationary model, specify a
time slice (ns_slice
) and the nonstationary model structure (ns_structure
).
Usage
utils_cdf(q, distribution, params, ns_slice = 0, ns_structure = NULL)
Arguments
q |
Numeric vector of quantiles with no missing values. |
distribution |
A three-character code indicating the distribution family.
Must be |
params |
Numeric vector of distribution parameters, in the order (location,
scale, shape). The length must be between 2 and 5, depending on the specified
|
ns_slice |
For NS-FFA only: Numeric scalar specifying the year at which to
evaluate the quantiles of a nonstationary probability distribution. |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Value
A numeric vector of quantiles with the same length as q
.
Examples
q <- seq(1, 10)
params <- c(1, 1, 1)
utils_cdf(q, "GEV", params)
Generalized Log-Likelihood Functions for GEV Models
Description
Computes the generalized log-likelihood for stationary and nonstationary variants of the Generalized Extreme Value (GEV) distribution with a geophysical (Beta) prior distribution for the shape parameter (Martins and Stedinger, 2000).
For NS-FFA: To compute the generalized log-likelihood for a nonstationary
probability model, include the observation years (ns_years
) and the nonstationary
model structure (ns_structure
).
Usage
utils_generalized_likelihood(
data,
params,
prior,
ns_years = NULL,
ns_structure = NULL
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
params |
Numeric vector of distribution parameters, in the order (location,
scale, shape). The length must be between 2 and 5, depending on the specified
|
prior |
Numeric vector of length 2. Specifies the parameters of the
Beta prior for the shape parameter |
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Details
The generalized log-likelihood is defined as sum of (1) the log-likelihood and (2)
the log-density of the Beta prior with parameters (p, q)
. The contribution of
the prior is:
\log \pi(\kappa) = (p-1) \log(0.5-\kappa) + (q-1) \log(0.5+\kappa)
- \log (\beta(p, q))
Value
Numeric scalar. The generalized log-likelihood value.
References
El Adlouni, S., Ouarda, T.B.M.J., Zhang, X., Roy, R., Bobee, B., 2007. Generalized maximum likelihood estimators for the nonstationary generalized extreme value model. Water Resources Research 43 (3), 1–13. doi:10.1029/2005WR004545
Martins, E. S., and Stedinger, J. R. (2000). Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resources Research, 36(3), 737–744. doi:10.1029/1999WR900330
See Also
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
params <- c(100, 10, 0.1)
prior <- c(1, 1)
# Compute the generalized log-likelihood
utils_generalized_likelihood(data, params, prior)
Log-Likelihood Functions for Probability Models
Description
Compute the log-likelihood for stationary and nonstationary probability models.
For NS-FFA: To compute the log-likelihood for a nonstationary probability model,
include the observation years (ns_years
) and the nonstationary model structure
(ns_structure
).
Usage
utils_log_likelihood(
data,
distribution,
params,
ns_years = NULL,
ns_structure = NULL
)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
distribution |
A three-character code indicating the distribution family.
Must be |
params |
Numeric vector of distribution parameters, in the order (location,
scale, shape). The length must be between 2 and 5, depending on the specified
|
ns_years |
For NS-FFA only: Numeric vector of observation years corresponding
to |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Value
Numeric scalar. The log-likelihood value.
See Also
utils_generalized_likelihood()
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
params <- c(100, 1, 10)
ns_years <- seq(from = 1901, to = 2000)
ns_structure <- list(location = TRUE, scale = FALSE)
# Compute the log-likelihood
utils_log_likelihood(data, "NOR", params, ns_years, ns_structure)
Quantile Functions for Probability Models
Description
Compute the quantiles for stationary and nonstationary probability models.
For NS-FFA: To compute the quantiles for a nonstationary probability model,
specify a time slice (ns_slice
) and the nonstationary model structure
(ns_structure
).
Usage
utils_quantiles(p, distribution, params, ns_slice = 0, ns_structure = NULL)
Arguments
p |
Numeric vector of probabilities between 0 and 1 with no missing values. |
distribution |
A three-character code indicating the distribution family.
Must be |
params |
Numeric vector of distribution parameters, in the order (location,
scale, shape). The length must be between 2 and 5, depending on the specified
|
ns_slice |
For NS-FFA only: Numeric scalar specifying the year at which to
evaluate the quantiles of a nonstationary probability distribution. |
ns_structure |
For NS-FFA only: Named list indicating which distribution parameters are modeled as nonstationary. Must contain two logical scalars:
|
Value
A numeric vector of quantiles with the same length as p
.
Examples
p <- runif(n = 100)
params <- c(1, 1, 1)
utils_quantiles(p, "GEV", params)
Sample L-moments
Description
Computes the first four sample L-moments and L-moment ratios from a numeric vector of data. L-moments are linear combinations of order statistics that provide robust alternatives to conventional moments, with advantages in parameter estimation for heavy-tailed and skewed distributions.
Usage
utils_sample_lmoments(data)
Arguments
data |
Numeric vector of observed annual maximum series values. Must be strictly positive, finite, and not missing. |
Details
Given probability weighted moments \beta_0, \beta_1, \beta_2, \beta_3
,
the first four sample L-moments are:
-
l_1 = \beta_0
-
l_2 = 2\beta_1 - \beta_0
-
l_3 = 6\beta_2 - 6\beta_1 + \beta_0
-
l_4 = 20\beta_3 - 30\beta_2 + 12\beta_1 - \beta_0
Then, the sample L-skewness is t_3 = l_3 / l_2
and the sample L-kurtosis
is t_4 = l_4 / l_2
.
Value
A numeric vector containing the first four sample L-moments and L-moment ratios:
-
l_1
: L-mean -
l_2
: L-variance -
t_3
: L-skewness -
t_4
: L-kurtosis
References
Hosking, J. R. M. (1990). L-moments: Analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society: Series B (Methodological), 52(1), 105–124.
Examples
data <- rnorm(n = 100, mean = 100, sd = 10)
utils_sample_lmoments(data)
Theoretical L-moments of Probability Distributions
Description
Computes the first four L-moments and L-moment ratios for stationary probability models.
Usage
utils_theoretical_lmoments(distribution, params)
Arguments
distribution |
A three-character code indicating the distribution family.
Must be |
params |
Numeric vector of distribution parameters, in the order (location,
scale, shape). The length must be between 2 and 5, depending on the specified
|
Details
The distributions "GUM"
, "NOR"
, "GEV"
, "GLO"
, and "WEI"
have
closed-form solutions for the L-moments and L-moment ratios in terms of the parameters.
The distributions "GNO"
and "PE3"
use rational approximations of the L-moment ratios
from Hosking (1997). The L-moments ratios for the "LNO"
and "LP3"
distributions
are should be compared to the log-transformed data and are thus identical to the "NOR"
and "PE3"
distributions respectively.
Value
A numeric vector of with four elements:
-
\lambda_1
: L-mean -
\lambda_2
: L-variance -
\tau_3
: L-skewness -
\tau_4
: L-kurtosis
References
Hosking, J.R.M. & Wallis, J.R., 1997. Regional frequency analysis: an approach based on L-Moments. Cambridge University Press, New York, USA.
Examples
utils_theoretical_lmoments("GEV", c(1, 1, 1))