class: center, middle, inverse, title-slide .title[ # CRPS-Learning ] .subtitle[ ## Jonathan Berrisch, Florian Ziel ] .author[ ### University of Duisburg-Essen ] .date[ ### 2022-06-04 ] --- name: motivation # Outline </br> </br> 1. Multivariate CRPS Learning - Smoothing procedures - Application to multivariate electricity price forecasts 2. The `profoc` R package - Package overview - Implementation details - Illustrative examples --- # Multivariate CRPS Learning .pull-left[ We extend most objects by an additional dimension We utilize the mean Pinball Score over the entire space for hyperparameter optimization (e.g, `\(\lambda\)`) Additionally, we extend the **B-Smooth** and **P-Smooth** procedures to the multivariate setting: - Basis matrices for reducing - - the probabilistic dimension from P to l - - the multivariate dimension from D to m - Hat matrices - - penalized smoothing across P and D dimensions ] .pull-right[ *Basis Smoothing* Represent weights as linear combinations of bounded basis functions: `\begin{align} \underbrace{w_{t,k}}_{D \text{ x } P} = \underbrace{\boldsymbol \varphi^{mult}}_{D\text{ x }m} \underbrace{\boldsymbol \beta_{t,k}'}_{m\text{ x }l} \underbrace{\boldsymbol \varphi^{prob}}_{l\text{ x }P} \end{align}` A popular choice: B-Splines `\(\boldsymbol \beta_{t,k}\)` is calculated using a reduced regret matrix: `\(\underbrace{\boldsymbol r_{t,k}}_{\text{l x m}} = \underbrace{\boldsymbol \varphi^{\text{prob}}}_{\text{lxP}} \underbrace{\left({\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widetilde{\boldsymbol X}_{t},Y_t)- {\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widehat{\boldsymbol X}_{t},Y_t)\right)}_{\text{PxD}}\underbrace{\boldsymbol \varphi^{\text{mult}}}_{\text{Dxm}}\)`
`\(\boldsymbol r_{t}\)` is transformed from PxK to LxK If `\(l = P\)` it holds that `\(\boldsymbol \varphi^{prob} = \boldsymbol{I}\)` (pointwise) For `\(l = 1\)` we receive constant weights ] --- # Multivariate CRPS Learning .pull-left[ Penalized cubic B-Splines for smoothing weights: Let `\(\varphi=(\varphi_1,\ldots, \varphi_L)\)` be bounded basis functions on `\((0,1)\)` Then we approximate `\(w_{t,k}\)` by `\begin{align} w_{t,k}^{\text{smooth}} = \sum_{l=1}^L \beta_l \varphi_l = \beta'\varphi \end{align}` with parameter vector `\(\beta\)`. The latter is estimated to penalize `\(L_2\)`-smoothing which minimizes `\begin{equation} \| w_{t,k} - \beta' \varphi \|^2_2 + \lambda \| \mathcal{D}^{d} (\beta' \varphi) \|^2_2 \label{eq_function_smooth} \end{equation}` with differential operator `\(\mathcal{D}\)` Computation is easy since we have an analytical solution and we can precompute the hat matrix. ] .pull-right[ </br> </br> **Multivariate smoothing core**: </br> for( t in 1:T ) { for( d in 1:D ) { .grey[...] `\(\boldsymbol w_{t,k} = \mathcal{H^{mult}} \boldsymbol \varphi^{mult} \beta'_{t,k} \varphi^{prob} \mathcal{H^{prob}}\)` } .grey[\#h] } .grey[\#t] ] --- name: application # Application to Multivariate Settings .pull-left[ Apply CRPS-Learning to (24 dimensional) day-ahead power forecasts Forecasts from probabilistic neural networks: Pre-Print to be released soon: Grzegorz Marcjasz, Michał Narajewski, Rafał Weron and Florian Ziel ```r # 182 obs as burn-in mod <- online( y = Y, # 736 x 24 experts = experts, # 736, 24, 99, 2 tau = 1:99 / 100 ) ```
CRPS Values
norm
jsu
comb
1.448
1.372
1.334
] .pull-right[ <div style="position:relative; margin-top:50px; z-index: 0">
] --- # Smoothing in Multivariate Setting .pull-left[ Smoothing parameters can be optimized online Below are parameters chosen for illustrative purposes ```r mod <- online( y = Y, experts = experts, tau = 1:99 / 100, p_smooth_pr = list( lambda = c(20, 50, 100) # 50 selected ), p_smooth_mv = list( lambda = c(20, 50, 100) # 50 selected ) ) ```
CRPS Values
norm
jsu
comb
1.448
1.372
1.333
] .pull-right[ <div style="position:relative; margin-top:50px; z-index: 0">
] --- # The Profoc Package .pull-left[ #### Probabilistic Forecast Combination - profoc Available on [Github](https://github.com/BerriJ/profoc) and [CRAN](https://CRAN.R-project.org/package=profoc) Main Function: `online()` for online learning. - Works with multivariate and/or probabilistic data - Implements BOA, ML-POLY, EWA (and the gradient versions) - Implements many extensions like smoothing, forgetting, thresholding, etc. - Various loss functions are available - Various methods (`predict`, `update`, `plot`, etc.) ] .pull-right[ #### Speed Large parts of profoc are implemented in C++. <center> <img src="profoc_langs.png"> </center> We use `Rcpp`, `RcppArmadillo`, and OpenMP. We use `Rcpp` modules to expose a class to R - Offers great flexibility for the end-user - Requires very little knowledge of C++ code - High-Level interface is easy to use ] --- # Profoc - B-Spline Basis .pull-left[ Basis specification `b_smooth_pr` is internally passed to `make_basis_mats()`: ```r mod <- online( y = Y, experts = experts, tau = 1:99 / 100, b_smooth_pr = list( knots = 9, mu = 0.5, sigma = 1, nonc = 0, tailweight = 1, deg = 3 ) ) ``` Knots are distributed using the generalized beta distribution. ] .pull-right[ Exemplary Basis with 9 Knots: </br> <img src="data:image/png;base64,#index_files/figure-html/unnamed-chunk-8-1.svg" style="display: block; margin: auto;" /> ] --- # Profoc - B-Spline Basis .pull-left[ Basis specification `b_smooth_pr` is internally passed to `make_basis_mats()`: ```r mod <- online( y = Y, experts = experts, tau = 1:99 / 100, b_smooth_pr = list( knots = 9, mu = 0.3, # NEW sigma = 1, nonc = 0, tailweight = 1, deg = 3 ) ) ``` Knots are distributed using the generalized beta distribution. ] .pull-right[ Exemplary Basis with 9 Knots ... ... and `mu` set to `0.3`: <img src="data:image/png;base64,#index_files/figure-html/unnamed-chunk-10-1.svg" style="display: block; margin: auto;" /> ] --- # Profoc - B-Spline Basis .pull-left[ Basis specification `b_smooth_pr` is internally passed to `make_basis_mats()`: ```r mod <- online( y = Y, experts = experts, tau = 1:99 / 100, b_smooth_pr = list( knots = 9, mu = 0.5, sigma = 2, # NEW nonc = 0, tailweight = 1, deg = 3 ) ) ``` Knots are distributed using the generalized beta distribution. ] .pull-right[ Exemplary Basis with 9 Knots ... ... and `sigma = 2`: <img src="data:image/png;base64,#index_files/figure-html/unnamed-chunk-12-1.svg" style="display: block; margin: auto;" /> ] --- # Profoc - Other Online Learning Options .pull-left[ **Forget Regret** Only taking part of the old cumulative regret into account Exponential forgetting of past regret `\begin{align*} R_{t,k} & = R_{t-1,k}(1-\xi) + \ell(\widetilde{F}_{t},Y_i) - \ell(\widehat{F}_{t,k},Y_i) \label{eq_regret_forget} \end{align*}` **Forget Past Performance** Only taking part of the old performance into account **Initialization** Initial Weights Initial Regret ] .pull-right[ **Thresholding** `\begin{align*} \mathcal{F}(x;\phi) & = \phi/K + (1-\phi) x , \\ \mathcal{S}(x;\nu) & = \text{sign}(x)||x|-\nu| , \\ \mathcal{H}(x;\kappa) & = x \mathbb{1}\{|x|>\kappa \} \end{align*}` for some `\(\phi\in[0,1]\)`, `\(\nu\geq0\)` and `\(\kappa\geq0\)`. **Pre Computed Elements** Precomputed Regret (also as a share) Precomputed Loss (also as a share) **Parametercombinations** Profoc considers all parameter combinations Specific combinations can be provided by the user ] --- # Profoc - Customizing Model Specifications .pull-left[ `profoc` also offers a low-level interface: - Can be used to provide custom Basis and Hat matrices - Can be used to manipulate all objects before learning - `online.R` function can be used as a boilerplate. - It also uses the `conline` class which functions as the low-level interface The right side shows an excerpt of `online.R` ] .pull-right[ ```r # Create a new instance of the conline class conline_obj <- new(conline) # Provide the data model_instance$y <- y model_instance$experts <- experts model_instance$tau <- tau # # Basis matrices for probabilistic smoothing tmp <- make_basis_mats( x = 1:P / (P + 1), n = P, mu = 0.5, sigma = 1, nonc = 0, tailw = 1, deg = 1 ) model_instance$basis_pr <- tmp$basis model_instance$params_basis_pr <- tmp$params # Execute online learning model_instance$learn() ``` ] --- # Profoc - Workflow example (1/2) .pull-left[ ```r # For data prep example see ?profoc::online model <- online( y = matrix(y), experts = experts, tau = prob_grid, p_smooth_pr = list( lambda = c(2^-10:10) ) ) print(model) ``` ``` ## Name Loss ## E1 0.4159400 ## E2 1.2095500 ## Combination 0.3528742 ``` ```r plot(model) ``` ] .pull-right[ ![](data:image/png;base64,#index_files/figure-html/unnamed-chunk-17-1.svg)<!-- --> ``` ## Showing Plot 1/2. Press [enter] to continue ``` ![](data:image/png;base64,#index_files/figure-html/unnamed-chunk-17-2.svg)<!-- --> ] --- # Profoc - Workflow example (2/2) .pull-left[ **Predict** ```r new_y <- matrix(rnorm(1)) # Realized new_experts <- experts[T, , , drop = FALSE] # Predict will expand model$predictions model <- predict(model, new_experts = new_experts, update_model = TRUE ) ``` The predict method will never change weights. It appends new_experts to the model if `update_model = TRUE` Corresponding realizations can be provided using `update()`. ] .pull-right[ **Update** `update()` calculates new weights based on new realizations. ```r # Update will update the models weights etc # if you provide new realizations model <- update(model, new_y = new_y ) # Predict can also take new_experts model <- update(model, new_y = new_y, new_experts = new_experts ) # The above updates, and predicts # at the same time ``` ] --- name: conclusion # Wrap-Up .pull-left[ The [
profoc](https://profoc.berrisch.biz/) R Package: Profoc is a flexible framework for online learning. - It implements several algorithms - It implements several loss functions - It implements several extensions - Its high- and low-level interfaces offer great flexibility Profoc is fast. - The core components are written in C++ - The core components utilize OpenMP for parallelization ] .pull-left[ Multivariate Extension: - Code is available now - Pre-Print available later this year Get these slides: .center[ <center> <img src="frame.png"> </center> [https://berrisch.biz/slides/22_06_iccf/](https://berrisch.biz/slides/22_06_iccf/) ] ] <a href="https://github.com/BerriJ" class="github-corner" aria-label="View source on Github"><svg width="80" height="80" viewBox="0 0 250 250" style="fill:#f2f2f2; color:#212121; position: absolute; top: 0; border: 0; right: 0;" aria-hidden="true"><path d="M0,0 L115,115 L130,115 L142,142 L250,250 L250,0 Z"></path><path d="M128.3,109.0 C113.8,99.7 119.0,89.6 119.0,89.6 C122.0,82.7 120.5,78.6 120.5,78.6 C119.2,72.0 123.4,76.3 123.4,76.3 C127.3,80.9 125.5,87.3 125.5,87.3 C122.9,97.6 130.6,101.9 134.4,103.2" fill="currentColor" style="transform-origin: 130px 106px;" class="octo-arm"></path><path d="M115.0,115.0 C114.9,115.1 118.7,116.5 119.8,115.4 L133.7,101.6 C136.9,99.2 139.9,98.4 142.2,98.6 C133.8,88.0 127.5,74.4 143.8,58.0 C148.5,53.4 154.0,51.2 159.7,51.0 C160.3,49.4 163.2,43.6 171.4,40.1 C171.4,40.1 176.1,42.5 178.8,56.2 C183.1,58.6 187.2,61.8 190.9,65.4 C194.5,69.0 197.7,73.2 200.1,77.6 C213.8,80.2 216.3,84.9 216.3,84.9 C212.7,93.1 206.9,96.0 205.4,96.6 C205.1,102.4 203.0,107.8 198.3,112.5 C181.9,128.9 168.3,122.5 157.7,114.1 C157.9,116.9 156.7,120.9 152.7,124.9 L141.0,136.5 C139.8,137.7 141.6,141.9 141.8,141.8 Z" fill="currentColor" class="octo-body"></path></svg></a><style>.github-corner:hover .octo-arm{animation:octocat-wave 560ms ease-in-out}@keyframes octocat-wave{0%,100%{transform:rotate(0)}20%,60%{transform:rotate(-25deg)}40%,80%{transform:rotate(10deg)}}@media (max-width:500px){.github-corner:hover .octo-arm{animation:none}.github-corner .octo-arm{animation:octocat-wave 560ms ease-in-out}}</style>