| Title: | Estimation of a Lognormal - Generalized Pareto Mixture |
|---|---|
| Description: | Estimation of a lognormal - Generalized Pareto mixture via the Expectation-Maximization algorithm. Computation of bootstrap standard errors is supported and performed via parallel computing. Functions for random number simulation and density evaluation are also available. For more details see Bee and Santi (2025) <doi:10.48550/arXiv.2505.22507>. |
| Authors: | Marco Bee [aut, cre] (ORCID: <https://orcid.org/0000-0002-9579-3650>) |
| Maintainer: | Marco Bee <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-20 09:00:53 UTC |
| Source: | https://github.com/marco-bee/logngpd |
This function evaluates the lognormal-GPD mixture density function.
dlognGPD(x, p, mu, sigma, xi, beta)

| Argument | Description |
|---|---|
| x | vector (n x 1): points where the function is evaluated. |
| p | real, 0 < p < 1: prior probability. |
| mu | real: log-mean of the truncated lognormal distribution. |
| sigma | positive real: log-standard deviation of the truncated lognormal distribution. |
| xi | real: shape parameter of the generalized Pareto distribution. |
| beta | positive real: scale parameter of the generalized Pareto distribution. |

ydens (n x 1) vector: numerical values of the lognormal - generalized Pareto mixture at x.
ydens <- dlognGPD(seq(0,20,length.out=500),.9,0,1,0.5,2)
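As a rough illustration of what a density of this kind looks like, the mixture can be sketched in base R. This is a sketch under stated assumptions, not the package's implementation: it assumes that p weights the lognormal component, that the lognormal is untruncated, and that the generalized Pareto has location zero. The helper names `dgpd0_sketch` and `dmix_sketch` are hypothetical.

```r
# Hypothetical helper: generalized Pareto density with location 0.
dgpd0_sketch <- function(x, xi, beta) {
  dens <- numeric(length(x))
  if (abs(xi) < 1e-12) {
    # Limiting case xi = 0: exponential density with mean beta
    ok <- x >= 0
    dens[ok] <- exp(-x[ok] / beta) / beta
  } else {
    z <- 1 + xi * x / beta
    ok <- x >= 0 & z > 0          # support of the distribution
    dens[ok] <- z[ok]^(-1 / xi - 1) / beta
  }
  dens
}

# Hypothetical helper: two-component mixture density.
# Assumption: p is the weight of the lognormal component.
dmix_sketch <- function(x, p, mu, sigma, xi, beta) {
  p * dlnorm(x, meanlog = mu, sdlog = sigma) +
    (1 - p) * dgpd0_sketch(x, xi, beta)
}

xgrid <- seq(0, 20, length.out = 500)
ydens_sketch <- dmix_sketch(xgrid, 0.9, 0, 1, 0.5, 2)

# Sanity check: a density should integrate to (approximately) one.
area <- integrate(function(x) dmix_sketch(x, 0.9, 0, 1, 0.5, 2), 0, Inf)$value
```

If the assumptions hold, `ydens_sketch` should be comparable to the output of `dlognGPD` on the same grid; any discrepancy would indicate that the package uses a different parameterization.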
This function evaluates the density of the continuous and differentiable version of the truncated lognormal-Pareto spliced distribution proposed by Scollnik (2007).
dlognPareto(x, sigma, xmin, alpha)

| Argument | Description |
|---|---|
| x | vector (n x 1): points where the function is evaluated. |
| sigma | positive real: log-standard deviation of the truncated lognormal distribution. |
| xmin | positive real: scale parameter of the Pareto distribution. |
| alpha | positive real: shape parameter of the Pareto distribution. |

To obtain a continuous and differentiable density, constraints must be enforced that reduce the number of free parameters of the model; in particular, the mixing weight and the log-mean of the lognormal distribution are functions of the remaining parameters. See Scollnik (2007) for details.
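The constraints can be made concrete with a short base R sketch. It assumes that xmin is the splicing point, that the body is a lognormal truncated at xmin, and that the tail is a Pareto with scale xmin. Matching the density and its first derivative at xmin then gives mu = log(xmin) - alpha * sigma^2 and mixing odds r / (1 - r) = sqrt(2 * pi) * alpha * sigma * pnorm(alpha * sigma) * exp((alpha * sigma)^2 / 2). The function names are hypothetical and the code is not taken from the package.

```r
# Hypothetical helper: implied log-mean and mixing weight under the
# continuity and differentiability constraints at xmin.
scollnik_constraints <- function(sigma, xmin, alpha) {
  a <- alpha * sigma
  mu <- log(xmin) - alpha * sigma^2
  odds <- sqrt(2 * pi) * a * pnorm(a) * exp(a^2 / 2)
  list(mu = mu, r = odds / (1 + odds))
}

# Hypothetical helper: spliced density built from those constraints.
dsplice_sketch <- function(x, sigma, xmin, alpha) {
  cs <- scollnik_constraints(sigma, xmin, alpha)
  dens <- numeric(length(x))
  in_body <- x > 0 & x <= xmin
  in_tail <- x > xmin
  # Body: lognormal density renormalized to (0, xmin]
  dens[in_body] <- cs$r * dlnorm(x[in_body], cs$mu, sigma) /
    plnorm(xmin, cs$mu, sigma)
  # Tail: Pareto density with scale xmin and shape alpha
  dens[in_tail] <- (1 - cs$r) * alpha * xmin^alpha / x[in_tail]^(alpha + 1)
  dens
}

cs <- scollnik_constraints(sigma = 1, xmin = 5, alpha = 2)
left  <- dsplice_sketch(5, 1, 5, 2)          # value at the splicing point
right <- dsplice_sketch(5 + 1e-9, 1, 5, 2)   # value just to the right
```

Under these assumptions `left` and `right` agree up to numerical error, which is the continuity condition; only sigma, xmin and alpha remain free.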
ysim (n x 1) vector: numerical values of the truncated lognormal-Pareto spliced distribution at x.
Scollnik DPM (2007). “On composite lognormal-Pareto models.” Scandinavian Actuarial Journal, 1, 20-33.
ysim <- dlognPareto(seq(0,20,length.out=500),1,5,2)
This function draws a bootstrap sample and uses it to estimate the parameters of a lognormal-Pareto mixture distribution. Since this is typically called by LPfitEM, see the help of LPfitEM for examples.
EMBoot(x, x0, y, maxiter)

| Argument | Description |
|---|---|
| x | list: sequence of integers 1,...,K, where K is the number of datasets. Set x = 1 in case of a single dataset. |
| x0 | numerical vector (5 x 1): initial values of the parameters p, mu, sigma, xi, beta. |
| y | numerical vector: observed sample. |
| maxiter | non-negative integer: maximum number of iterations of the EM algorithm. |

At each bootstrap replication, the mixture is estimated via the EM algorithm.
Estimated parameters obtained from a bootstrap sample.
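The idea of one bootstrap replication, and of turning many replications into standard errors, can be sketched generically in base R. Everything here is illustrative: `fit_sketch` is a hypothetical stand-in estimator (lognormal maximum likelihood, not the EM fit of the mixture), `boot_once_sketch` is a hypothetical name, and the resampling scheme is assumed to be the ordinary nonparametric bootstrap.

```r
# Hypothetical stand-in for the model fit: lognormal ML estimates.
fit_sketch <- function(y) {
  ly <- log(y)
  c(mu = mean(ly), sigma = sqrt(mean((ly - mean(ly))^2)))
}

# One replication. The first argument is only the replication index,
# which is what lapply-style functions pass to the worker.
boot_once_sketch <- function(i, y, estimator) {
  ystar <- sample(y, length(y), replace = TRUE)  # resample with replacement
  estimator(ystar)
}

set.seed(1)
y <- rlnorm(200, meanlog = 0, sdlog = 1)
nboot <- 50

# Serial version; a parallel backend such as parallel::parLapply could
# be substituted for lapply without changing the worker function.
boot_list <- lapply(seq_len(nboot), boot_once_sketch,
                    y = y, estimator = fit_sketch)
boot_est <- do.call(rbind, boot_list)   # nboot x 2 matrix of estimates
boot_std <- apply(boot_est, 2, sd)      # bootstrap standard errors
```

Because each replication depends only on its index and the data, the replications are independent and can be distributed across workers.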
This function estimates a static lognormal - generalized Pareto mixture by means of the EM algorithm. Optionally, bootstrap standard errors are computed via parallel computing.
EMlogngpdmix(x0, y, maxiter, nboot = 0)

| Argument | Description |
|---|---|
| x0 | numerical vector (5 x 1): initial values of the parameters p, mu, sigma, xi, beta. |
| y | vector: observed data. |
| maxiter | positive integer: maximum number of iterations of the EM algorithm. |
| nboot | non-negative integer: number of bootstrap replications for the computation of the standard errors (defaults to 0, in which case no standard errors are computed). |

A list with the following elements is returned:
"p" = estimated value of p,
"post" = posterior probabilities of all observations,
"mu" = estimated value of mu,
"sigma" = estimated value of sigma,
"xi" = estimated value of xi,
"beta" = estimated value of beta,
"loglik" = maximized log-likelihood,
"nit" = number of iterations,
"bootEst" = matrix of parameter estimates at each bootstrap replication (only if nboot > 0),
"bootStd" = bootstrap standard errors of each parameter (only if nboot > 0).
y <- rlognGPD(100,.9,0,1,0.5,2)
x0 <- c(.7,.2,1.3,.8,1.7)
res <- EMlogngpdmix(x0, y, 1000)
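For readers unfamiliar with EM for two-component mixtures, one iteration can be sketched in base R as follows. This is a simplified illustration and not the package's code: it assumes p is the weight of an untruncated lognormal component, it updates only p, mu and sigma in closed form, and it holds the generalized Pareto parameters fixed (in a full algorithm they would be updated by numerically maximizing a weighted log-likelihood). The names `dgpd0_sketch`, `em_step_sketch` and `post_ln` are hypothetical, and whether the package's "post" refers to the lognormal or the generalized Pareto component is not stated here.

```r
# Hypothetical helper: GPD density with location 0 (xi != 0 only).
dgpd0_sketch <- function(x, xi, beta) {
  z <- 1 + xi * x / beta
  ifelse(x >= 0 & z > 0, pmax(z, 0)^(-1 / xi - 1) / beta, 0)
}

# One EM iteration with the GPD parameters held fixed.
em_step_sketch <- function(y, p, mu, sigma, xi, beta) {
  f_ln  <- p * dlnorm(y, mu, sigma)
  f_gpd <- (1 - p) * dgpd0_sketch(y, xi, beta)
  # E-step: posterior probability that each observation is lognormal
  post_ln <- f_ln / (f_ln + f_gpd)
  # M-step (closed-form part): weighted updates of p, mu, sigma
  p_new     <- mean(post_ln)
  mu_new    <- sum(post_ln * log(y)) / sum(post_ln)
  sigma_new <- sqrt(sum(post_ln * (log(y) - mu_new)^2) / sum(post_ln))
  list(post_ln = post_ln, p = p_new, mu = mu_new, sigma = sigma_new,
       loglik = sum(log(f_ln + f_gpd)))  # log-likelihood at the input values
}

set.seed(1)
# Simulated data: 90 lognormal draws and 10 GPD draws (xi = 0.5, beta = 2)
y_sim <- c(rlnorm(90), 2 / 0.5 * ((1 - runif(10))^(-0.5) - 1))
step1 <- em_step_sketch(y_sim, 0.7, 0.2, 1.3, 0.8, 1.7)
step2 <- em_step_sketch(y_sim, step1$p, step1$mu, step1$sigma, 0.8, 1.7)
```

Since each step maximizes the expected complete-data log-likelihood over p, mu and sigma, `step2$loglik` should not be smaller than `step1$loglik`.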
This function simulates a lognormal-GPD mixture.
rlognGPD(n, p, mu, sigma, xi, beta)

| Argument | Description |
|---|---|
| n | positive integer: number of observations sampled. |
| p | real, 0 < p < 1: prior probability. |
| mu | real: log-mean of the lognormal distribution. |
| sigma | positive real: log-standard deviation of the lognormal distribution. |
| xi | real: shape parameter of the generalized Pareto distribution. |
| beta | positive real: scale parameter of the generalized Pareto distribution. |

ysim (n x 1) vector: n random numbers from the lognormal - generalized Pareto mixture.
ysim <- rlognGPD(100,.9,0,1,0.5,2)
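Simulation from a two-component mixture of this kind can be sketched in base R by first drawing the component label and then drawing from the selected component. The sketch assumes p is the probability of the lognormal component and that the generalized Pareto has location zero, so that it can be sampled by inverting its distribution function. `rmix_sketch` is a hypothetical name, not a package function.

```r
# Hypothetical helper: draw n values from the mixture.
rmix_sketch <- function(n, p, mu, sigma, xi, beta) {
  # Component labels: TRUE means the lognormal component (assumption)
  from_ln <- rbinom(n, size = 1, prob = p) == 1
  u <- runif(n)
  # GPD draws by inverse transform; xi = 0 is the exponential limit
  gpd_draw <- if (abs(xi) < 1e-12) {
    -beta * log(1 - u)
  } else {
    beta / xi * ((1 - u)^(-xi) - 1)
  }
  ifelse(from_ln, rlnorm(n, meanlog = mu, sdlog = sigma), gpd_draw)
}

set.seed(1)
ysim_sketch <- rmix_sketch(1000, 0.9, 0, 1, 0.5, 2)
```

The sketch generates draws for both components and discards the unused ones, which is wasteful but keeps the code short.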
This function simulates the continuous and differentiable version of the truncated lognormal-Pareto spliced distribution proposed by Scollnik (2007).
rlognPareto(n, sigma, xmin, alphapar)

| Argument | Description |
|---|---|
| n | positive integer: number of observations sampled. |
| sigma | positive real: log-standard deviation of the truncated lognormal distribution. |
| xmin | positive real: scale parameter of the Pareto distribution. |
| alphapar | positive real: shape parameter of the Pareto distribution. |

See Scollnik (2007) for details.
ysim (n x 1) vector: n random numbers from the truncated lognormal-Pareto spliced distribution.
Scollnik DPM (2007). “On composite lognormal-Pareto models.” Scandinavian Actuarial Journal, 1, 20-33.
ysim <- rlognPareto(100,1,5,2)
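A possible way to simulate from a spliced distribution of this type is sketched below in base R. It assumes that xmin is the splicing point and that the smoothness constraints take the form mu = log(xmin) - alpha * sigma^2 and r / (1 - r) = sqrt(2 * pi) * alpha * sigma * pnorm(alpha * sigma) * exp((alpha * sigma)^2 / 2), which follow from matching the density and its derivative at xmin. Both pieces are sampled by inverse transform. `rsplice_sketch` is a hypothetical name and the package may proceed differently.

```r
# Hypothetical helper: draw n values from the spliced distribution.
rsplice_sketch <- function(n, sigma, xmin, alpha) {
  a <- alpha * sigma
  mu <- log(xmin) - alpha * sigma^2                 # implied log-mean
  odds <- sqrt(2 * pi) * a * pnorm(a) * exp(a^2 / 2)
  r <- odds / (1 + odds)                            # implied body weight
  in_body <- runif(n) < r
  u <- runif(n)
  # Body: lognormal truncated to (0, xmin), via its quantile function
  body_draw <- qlnorm(u * plnorm(xmin, mu, sigma), mu, sigma)
  # Tail: Pareto with scale xmin and shape alpha
  tail_draw <- xmin * u^(-1 / alpha)
  ifelse(in_body, body_draw, tail_draw)
}

set.seed(1)
xsim_sketch <- rsplice_sketch(2000, sigma = 1, xmin = 5, alpha = 2)
tail_share <- mean(xsim_sketch > 5)   # empirical share of tail draws
```

With these parameter values the implied body weight r is close to one, so only a small fraction of the draws should exceed xmin.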
This function evaluates the zero-mean generalized Pareto log-likelihood function computed with weighted observations.
weiGpdLik(x, y, post)

| Argument | Description |
|---|---|
| x | numerical vector (2 x 1): values of the two parameters of the generalized Pareto distribution. |
| y | numerical vector (n x 1): observed data. |
| post | numerical vector (n x 1) with elements in (0,1): weights of the observations (in the EM algorithm, posterior probabilities). |

llik real: numerical value of the log-likelihood function
y <- rlognGPD(100,.9,0,1,0.5,2)
x0 <- c(.7,.2,1.3,.8,1.7)
res <- EMlogngpdmix(x0, y, 1000)
llik <- weiGpdLik(c(res$beta,res$xi),y,res$post)
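A weighted generalized Pareto log-likelihood of this kind can be sketched in base R as a weighted sum of log-densities. The sketch assumes a location of zero and takes the shape and scale as separate named arguments to avoid any assumption about their order in the package's parameter vector. `wgpd_loglik_sketch` is a hypothetical name.

```r
# Hypothetical helper: weighted GPD log-likelihood with location 0.
# w holds the observation weights (e.g. posterior probabilities).
wgpd_loglik_sketch <- function(xi, beta, y, w) {
  z <- 1 + xi * y / beta
  # Outside the parameter space or the support: log-likelihood is -Inf
  if (beta <= 0 || any(y < 0) || any(z <= 0)) return(-Inf)
  logdens <- if (abs(xi) < 1e-12) {
    -log(beta) - y / beta                 # exponential limit
  } else {
    -log(beta) - (1 / xi + 1) * log(z)
  }
  sum(w * logdens)
}

set.seed(1)
# GPD sample with xi = 0.5, beta = 2, drawn by inverse transform
y_gpd <- 2 / 0.5 * ((1 - runif(500))^(-0.5) - 1)
w_one <- rep(1, length(y_gpd))            # unit weights: ordinary likelihood

# Numerical maximization, as an M-step for the GPD parameters might do
fit_gpd <- optim(c(0.1, 1), function(par) {
  -wgpd_loglik_sketch(par[1], par[2], y_gpd, w_one)
})
```

With unit weights the function reduces to the ordinary log-likelihood; with weights in (0,1) each observation contributes in proportion to its weight.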