Univariate analysis for discrete risk factors in an insurance portfolio. The following summary statistics are calculated:
- frequency (i.e. number of claims / exposure)
- average severity (i.e. severity / number of claims)
- risk premium (i.e. severity / exposure)
- loss ratio (i.e. severity / premium)
- average premium (i.e. premium / exposure)
If an input argument is not specified, the summary statistics that depend on it are omitted from the output.
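The ratios above can be reproduced by hand, which may help when checking the output. The following is a minimal base R sketch, not the package's implementation: it assumes each column is first summed per level of the risk factor and the ratios are then taken on those totals, which matches the example output further down. The data frame `portfolio` and its values are hypothetical.

```r
# Toy portfolio (hypothetical values, for illustration only)
portfolio <- data.frame(
  area     = c("A", "A", "B", "B"),
  amount   = c(1000, 0, 500, 1500),  # claim amounts (severity)
  nclaims  = c(1, 0, 1, 2),
  exposure = c(1, 0.5, 1, 1),
  premium  = c(600, 300, 500, 700)
)

# Sum every numeric column within each level of the risk factor
totals <- aggregate(
  cbind(amount, nclaims, exposure, premium) ~ area,
  data = portfolio,
  FUN  = sum
)

# Derive the ratios from the per-level totals
totals$frequency        <- totals$nclaims / totals$exposure
totals$average_severity <- totals$amount  / totals$nclaims
totals$risk_premium     <- totals$amount  / totals$exposure
totals$loss_ratio       <- totals$amount  / totals$premium
totals$average_premium  <- totals$premium / totals$exposure

print(totals)
```

Leaving out a column, for instance `premium`, would make it impossible to compute the loss ratio and the average premium, which is why those statistics do not appear when the corresponding argument is not supplied.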
Usage
univariate(
df,
x,
severity = NULL,
nclaims = NULL,
exposure = NULL,
premium = NULL,
by = NULL
)
Arguments
- df
data.frame with insurance portfolio
- x
column in df with risk factor, or use vec_ext() for use with an external vector (see examples)
- severity
column in df with severity (default is NULL)
- nclaims
column in df with number of claims (default is NULL)
- exposure
column in df with exposure (default is NULL)
- premium
column in df with premium (default is NULL)
- by
list of column(s) in df to group by
Examples
# Summarize by `area`
univariate(MTPL2, x = area, severity = amount, nclaims = nclaims,
exposure = exposure, premium = premium)
#> # A tibble: 4 × 10
#> area amount nclaims exposure premium frequency average_severity risk_premium
#> <int> <int> <int> <dbl> <int> <dbl> <dbl> <dbl>
#> 1 2 4063270 98 819. 51896 0.120 41462. 4964.
#> 2 3 7945311 113 765. 49337 0.148 70312. 10386.
#> 3 1 6896187 146 1066. 65753 0.137 47234. 6471.
#> 4 0 6922 1 13.3 902 0.0751 6922 520.
#> # ℹ 2 more variables: loss_ratio <dbl>, average_premium <dbl>
# Summarize by `area`, with column name in external vector
xt <- "area"
univariate(MTPL2, x = vec_ext(xt), severity = amount, nclaims = nclaims,
exposure = exposure, premium = premium)
#> # A tibble: 4 × 10
#> area amount nclaims exposure premium frequency average_severity risk_premium
#> <int> <int> <int> <dbl> <int> <dbl> <dbl> <dbl>
#> 1 2 4063270 98 819. 51896 0.120 41462. 4964.
#> 2 3 7945311 113 765. 49337 0.148 70312. 10386.
#> 3 1 6896187 146 1066. 65753 0.137 47234. 6471.
#> 4 0 6922 1 13.3 902 0.0751 6922 520.
#> # ℹ 2 more variables: loss_ratio <dbl>, average_premium <dbl>
# Summarize by `zip` and `bm`
univariate(MTPL, x = zip, severity = amount, nclaims = nclaims,
exposure = exposure, by = bm)
#> # A tibble: 84 × 8
#> zip bm amount nclaims exposure frequency average_severity risk_premium
#> <fct> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 1 5 4938135 82 550. 0.149 60221. 8983.
#> 2 1 3 3623485 86 614. 0.140 42134. 5902.
#> 3 2 8 1739654 38 249. 0.152 45780. 6981.
#> 4 1 10 2077041 73 451. 0.162 28453. 4601.
#> 5 3 1 20064123 381 2841. 0.134 52662. 7062.
#> 6 3 6 3814492 82 539. 0.152 46518. 7081.
#> 7 3 2 11182348 179 1282. 0.140 62471. 8726.
#> 8 2 1 25368747 356 2944. 0.121 71261. 8617.
#> 9 1 2 17512277 287 1835. 0.156 61018. 9542.
#> 10 2 9 574527 25 237. 0.106 22981. 2428.
#> # ℹ 74 more rows
# Summarize by `zip`, `bm` and `power`
univariate(MTPL, x = zip, severity = amount, nclaims = nclaims,
exposure = exposure, by = list(bm, power))
#> # A tibble: 3,290 × 9
#> zip bm power amount nclaims exposure frequency average_severity
#> <fct> <int> <int> <dbl> <int> <dbl> <dbl> <dbl>
#> 1 1 5 106 0 0 1 0 NaN
#> 2 1 3 74 2687 1 14.1 0.0707 2687
#> 3 2 8 65 0 0 5 0 NaN
#> 4 1 10 64 0 0 7 0 NaN
#> 5 3 1 29 37784 3 21.9 0.137 12595.
#> 6 3 6 66 114021 2 27.6 0.0726 57010.
#> 7 3 2 43 1382215 11 61.3 0.180 125656.
#> 8 3 2 55 764498 27 146. 0.185 28315.
#> 9 3 1 100 3405 1 14.2 0.0703 3405
#> 10 3 2 66 929945 15 97.1 0.154 61996.
#> # ℹ 3,280 more rows
#> # ℹ 1 more variable: risk_premium <dbl>
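The NaN entries in average_severity in the last output belong to groups that have exposure but no claims, so severity / number of claims evaluates to 0 / 0. A minimal base R sketch of that arithmetic, using the totals from the first two rows of that output; the final step, replacing NaN with NA, is only one possible way to post-process such groups and is an assumption rather than something univariate() does.

```r
# Totals taken from the first two rows of the output above
amount  <- c(0, 2687)
nclaims <- c(0, 1)

# Division by zero claims gives NaN for the first group
average_severity <- amount / nclaims
print(average_severity)
print(is.nan(average_severity))

# One possible post-processing step: treat undefined severities as missing
average_severity[is.nan(average_severity)] <- NA
print(average_severity)
```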