Data-Driven Identification of Power Plant Operation States Using Clustering

Outline

EfeMOD Project

Motivation and Objective

Data

Empirical Approach

Results

Conclusion / Outlook

EfeMOD

Empirisch fundierte Elektrizitätsmarkt-Modellierung mit Open Data (Empirically Grounded Electricity Market Modelling with Open Data)

Project Entities:

Chair of Prof. Dr. Christoph Weber (Management Sciences and Energy Economics)

Chair of Prof. Dr. Florian Ziel (Data Science in Energy and Environment)

Project Goal:

Use publicly available data (particularly from the ENTSO-E Transparency Platform) to estimate parameters for energy system and energy market models.

Motivation and Objective

Identification of Power Plant Operation States Using Clustering

Gain Knowledge about the Power Plant Characteristics

Operation points
Efficiency
Capacity, etc.

This Presentation:

Identify Operation States:

Stable Operation
Startup
Minimum-Stable Operation, etc.

Provide these characteristics to other researchers

e.g. to estimate efficiency

Data

ENTSO-E data:

ActualGenerationOutputPerGenerationUnit_16.1.A
UnavailabilityOfGenerationUnits_15.1.A_B

We focus on natural gas units:

63 units in DE_LU bidding zone
299 units across all bidding zones

We use recent data:

2020-01-01 until “now”

Heizkraftwerk Lausward

Location: Düsseldorf

Block Anton (Block AGuD)

Combined cycle gas turbine (CCGT)

Electrical output: 103 MW

75 MW of district heating can be decoupled

Efficiency: 54%

Fuel Utilization Rate: 87% (with district heating)

Erdgaskraftwerk Emsland

Location: Lingen (Ems)

Block C

Combined cycle gas turbine (CCGT)

Electrical output: 475 MW

Efficiency: 46%

Black-start capable.

Empirical Approach

Overview

Empirical identification of states

3-Step Approach:

Prior Partitioning
- We create preliminary clusters
- They will be used to initialize the main clustering
Main Clustering
- Gaussian Model Based Clustering
Label Assignment
- We assign meaningful labels to the final clusters

Empirical Approach

Prior Partitioning

Divide the space in meaningful partitions:

Define the capacity: \(\zeta = \max(t0)\)

Define a threshold: \(\gamma = \frac{\zeta}{50}\)

\(\pm \gamma\) around the diagonal: Stable
\(t0 < 1\) & \(t1 < 1\): Zero
\(t0 < \gamma\) & \(t1 > 1\): Startup
\(t0 > 1\) & \(t1 < \gamma\): Shutdown
\(t1 > t0\): Ramp-Up
\(t1 < t0\): Ramp-Down

For the next step, we project Stable observations onto the diagonal, Startup observations onto the \(t1\) axis and Shutdown observations onto the \(t0\) axis.
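The partition rules above can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation. The function names are hypothetical, and two details are assumptions because the slide leaves them open: the corner rules (Zero, Startup, Shutdown) are checked before the Stable band, and "\(\pm \gamma\) around the diagonal" is read as \(|t1 - t0| \le \gamma\).

```python
def prior_partition(t0, t1, capacity):
    """Assign one (t0, t1) observation to a preliminary region.

    Assumption: corner rules take precedence over the Stable band,
    and the band is measured as |t1 - t0| <= gamma.
    """
    gamma = capacity / 50.0  # threshold gamma = zeta / 50
    if t0 < 1 and t1 < 1:
        return "Zero"
    if t0 < gamma and t1 > 1:
        return "Startup"
    if t0 > 1 and t1 < gamma:
        return "Shutdown"
    if abs(t1 - t0) <= gamma:
        return "Stable"
    if t1 > t0:
        return "Ramp-Up"
    return "Ramp-Down"


def prior_partitions(t0_values, t1_values):
    """Label every observation; the capacity zeta is max(t0), as on the slide."""
    zeta = max(t0_values)
    return [prior_partition(a, b, zeta) for a, b in zip(t0_values, t1_values)]


# Hypothetical generation values in MW (not taken from the ENTSO-E data).
labels = prior_partitions([0, 0.5, 30, 50, 40, 100], [0, 30, 0.5, 51, 60, 70])
# labels == ["Zero", "Startup", "Shutdown", "Stable", "Ramp-Up", "Ramp-Down"]
```

A different rule order would change the label of points near the origin that satisfy more than one condition, so the order should be checked against the actual implementation.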

Empirical Approach

Prior Partitioning

Model-Based Clustering of the Regions using mclust::Mclust in R.

Stable: 2-5 Clusters
Ramp Up: 2-4 Clusters
Ramp Down: 2-4 Clusters

Obtain finite mixture distribution:

\[\sum_{k=1}^{G}{\pi_k f_k (\mathbf{x}; \mathbf{\theta}_k)}\]

\(f_k\) Density of the \(k\)-th component
\(\pi_k\) Mixture weights
\(\theta_k\) Parameters of the \(k\)-th component density
\(G\) Number of mixture components

Empirical Approach

Prior Partitioning

\[f(\mathbf{x}; \mathbf{\Psi}) = \sum_{k=1}^{G}{\pi_k \phi (\mathbf{x}; \mathbf{\mu}_k, \mathbf{\Sigma}_k)}\]

\(\phi(\cdot)\) Multivariate Gaussian density
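As a small illustration of the mixture density above, the following sketch evaluates a bivariate Gaussian mixture at a point \((t0, t1)\) using only the Python standard library. The function names and the parameter values in the example are hypothetical; the deck itself relies on mclust in R for this.

```python
import math


def gaussian_pdf_2d(x, mu, sigma):
    """Bivariate Gaussian density phi(x; mu, Sigma).

    sigma is a 2x2 covariance given as ((s00, s01), (s10, s11)).
    """
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu) via the closed-form 2x2 inverse.
    quad = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_density(x, weights, means, covariances):
    """f(x; Psi) = sum_k pi_k * phi(x; mu_k, Sigma_k)."""
    return sum(
        w * gaussian_pdf_2d(x, m, s)
        for w, m, s in zip(weights, means, covariances)
    )


# Hypothetical two-component mixture: one cluster near zero output,
# one cluster along the diagonal at roughly 80 MW.
weights = [0.4, 0.6]
means = [(0.0, 0.0), (80.0, 80.0)]
covariances = [((1.0, 0.0), (0.0, 1.0)), ((4.0, 3.0), (3.0, 4.0))]
density_at_point = mixture_density((79.0, 80.0), weights, means, covariances)
```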

Maximum Likelihood Estimation via Expectation Maximization (EM) algorithm

Log-likelihood of Gaussian Mixture Models (GMMs):

\[\begin{align} \ell(\Psi) = \sum_{i=1}^n \log \left\{ \sum_{k=1}^G \pi_k \phi(x_i; \mu_k, \Sigma_k) \right\} \end{align}\]

We reformulate this log-likelihood as a complete-data log-likelihood so that the EM algorithm can be applied:

\[\begin{align} \ell_{\mathcal{C}}(\Psi) = \sum_{i=1}^n \sum_{k=1}^G z_{ik} \left\{ \log \pi_k + \log \phi(x_i; \mu_k, \Sigma_k) \right\} \end{align}\]

\[\begin{align} z_{ik} = \begin{cases} 1 & \text{if } x_i \text{ belongs to component }k \\ 0 & \text{otherwise.} \end{cases} \end{align}\]

E-Step:

\[\begin{align} \hat{z}_{ik} = \frac{\hat{\pi}_k \phi(x_i; \hat{\mu}_k, \hat{\Sigma}_k)}{\sum_{g=1}^{G} \hat{\pi}_g \phi(x_i; \hat{\mu}_g, \hat{\Sigma}_g)}, \end{align}\]

M-Step:

\[\begin{align} \quad \hat{\mu}_k = \frac{\sum_{i=1}^{n} \hat{z}_{ik} x_i}{n_k}, \quad \text{where} \quad n_k = \sum_{i=1}^{n} \hat{z}_{ik}. \end{align}\]
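The E-step and M-step above can be tried out with a short univariate sketch. It is a simplification under stated assumptions: one dimension instead of the multivariate case on the slides, starting values supplied by the caller instead of MBAHC, and a small variance floor added for numerical safety. The weight and variance updates are the standard GMM updates; the slide itself only shows the mean update. All function names are hypothetical.

```python
import math


def normal_pdf(x, mu, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(xs, weights, means, variances):
    """Responsibilities z_hat[i][k], following the E-step formula."""
    z = []
    for x in xs:
        joint = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        z.append([j / total for j in joint])
    return z


def m_step(xs, z):
    """Update weights, means and variances from the responsibilities."""
    n, n_components = len(xs), len(z[0])
    weights, means, variances = [], [], []
    for k in range(n_components):
        n_k = sum(z[i][k] for i in range(n))
        mu_k = sum(z[i][k] * xs[i] for i in range(n)) / n_k
        var_k = sum(z[i][k] * (xs[i] - mu_k) ** 2 for i in range(n)) / n_k
        weights.append(n_k / n)
        means.append(mu_k)
        variances.append(max(var_k, 1e-6))  # assumed floor against collapse
    return weights, means, variances


def log_likelihood(xs, weights, means, variances):
    """Observed-data log-likelihood l(Psi)."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in xs
    )


def fit_gmm_1d(xs, weights, means, variances, n_iter=50):
    """Run a fixed number of EM iterations from the given starting values."""
    for _ in range(n_iter):
        z = e_step(xs, weights, means, variances)
        weights, means, variances = m_step(xs, z)
    return weights, means, variances


# Hypothetical data: two well-separated groups around 0 and 10.
xs = [0.0, 0.2, -0.2, 0.1, -0.1, 10.0, 10.2, 9.8, 10.1, 9.9]
start = ([0.5, 0.5], [1.0, 9.0], [4.0, 4.0])
weights, means, variances = fit_gmm_1d(xs, *start)
```

A production implementation would work with log-densities to avoid underflow and would stop on a convergence criterion instead of a fixed iteration count.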

Empirical Approach

Prior Partitioning

Initialization

We initialize the EM algorithm (E-Step) using the partitions obtained from model-based agglomerative hierarchical clustering (MBAHC)

Estimation

The Bayesian information criterion (BIC) is used for model selection
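To make the model selection step concrete, here is a minimal sketch of a BIC comparison across different numbers of components. It assumes an unrestricted covariance matrix per component and uses the common convention \(-2\ell + p \log n\), where the lowest value wins; mclust reports the same quantity with the opposite sign, so there the largest value is selected. The log-likelihood values in the example are made up for illustration.

```python
import math


def gmm_free_parameters(n_components, dim):
    """Free parameters of a GMM with unrestricted covariances (assumption):
    (G - 1) weights + G * d means + G * d * (d + 1) / 2 covariance entries."""
    return ((n_components - 1)
            + n_components * dim
            + n_components * dim * (dim + 1) // 2)


def bic(log_lik, n_params, n_obs):
    """BIC in the 'lower is better' convention."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def select_model(candidates, n_obs, dim):
    """candidates maps G -> maximised log-likelihood; return the best G and all scores."""
    scores = {
        g: bic(ll, gmm_free_parameters(g, dim), n_obs)
        for g, ll in candidates.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical log-likelihoods for G = 2, 3, 4 on 1000 bivariate observations.
best_g, scores = select_model({2: -1100.0, 3: -1000.0, 4: -995.0}, n_obs=1000, dim=2)
# best_g == 3: the gain from a fourth component does not offset its penalty.
```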

Prior Partitioning Results

Right graph shows prior clusters.

Figures: Lausward, Emsland

Empirical Approach

Main Clustering

MBAHC

Prior Clusters are used in MBAHC

The results of the MBAHC are used to initialize the EM Algorithm in the main Gaussian Model Based Clustering

Main Clustering Results

Right graph shows Maximum A Posteriori (MAP) Classification

Colour indicates the cumulative log(density) of all components.

Figures: Lausward, Emsland

Empirical Approach

Label Assignment

We assign labels to the clusters using their mean \(\mu\) and correlation \(\rho\)

Multiple clusters may describe one Generation State (e.g., along the diagonal)

# A tibble: 6 × 4
  classification      mu_t0      mu_t1      cor
           <int>      <dbl>      <dbl>    <dbl>
1              1 -0.0000290 -0.0000338 -0.00562
2              2 33.6       33.8        0.703  
3              3 10.5       48.0        0.795  
4              4 83.2       88.4        0.821  
5              5 82.4       82.4        1.00   
6              6 80.5       80.1        0.978

\[\begin{align} \text{State} = \begin{cases} \text{Zero} & (\mu_{t0} < 1) \land (\mu_{t1} < 1), \\ \text{MSO} & \left[ (\mu_{t0} > \zeta/10) \land (\mu_{t1} > \zeta / 10) \land (\left| \mu_{t0} - \mu_{t1} \right| > \zeta / 10) \right]\\ & \rightarrow \operatorname{argmin}(\mu_{t0} + \mu_{t1}), \\ \text{Max Capacity} & \rightarrow \operatorname{argmax}(\mu_{t0} + \mu_{t1}), \\ \text{Startup} & (\mu_{t1} \geq \zeta / 10) \land (\mu_{t0} < \gamma) \land (\rho < 0.3), \\ \text{Shutdown} & (\mu_{t0} \geq \zeta / 10) \land (\mu_{t1} < \gamma) \land (\rho < 0.3), \\ \text{Stable Operation} & \text{Remaining clusters with } \rho > 0.8, \\ \text{Ramp Up} & \text{Remaining clusters: } \mu_{t1} > \mu_{t0}, \\ \text{Ramp Down} & \text{Remaining clusters: } \mu_{t1} < \mu_{t0}. \end{cases} \end{align}\]

Empirical Approach

Label Assignment

Right graphs show assigned states

The points are coloured according to

MAP
Probability (each pure colour reflects a probability of 1)

Some points below the diagonal are assigned to Ramp Up, and some points above it to Ramp Down

This is easy to fix for the MAP classification
Fixing the probabilistic predictions is harder

Figures: LSW, LSW Pr, EMS, EMS Pr

Empirical Approach

Label Assignment

Fixing assignments

Relabeling Ramp Up and Ramp Down MAP predictions is trivial:

\[\begin{align} \text{State} = \begin{cases} \text{Ramp Up} & x_{t1} > x_{t0}, \\ \text{Ramp Down} & x_{t1} < x_{t0}. \end{cases} \end{align}\]
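The relabeling rule above amounts to a few lines of code. In this sketch the function name is hypothetical, and two points are assumptions: only observations whose MAP label is already a ramp state are touched, and ties (\(x_{t1} = x_{t0}\)) keep their label because the rule does not cover them.

```python
def relabel_ramp(label, x_t0, x_t1):
    """Correct a Ramp Up / Ramp Down MAP label using the observation itself."""
    if label not in ("Ramp Up", "Ramp Down"):
        return label  # assumption: other states are left untouched
    if x_t1 > x_t0:
        return "Ramp Up"
    if x_t1 < x_t0:
        return "Ramp Down"
    return label  # tie: not covered by the rule, keep the label


# An observation with falling output that was classified as Ramp Up.
fixed = relabel_ramp("Ramp Up", x_t0=60.0, x_t1=45.0)
# fixed == "Ramp Down"
```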

Fixing the probability array is more involved:

Find observations with \(x_{t1} < x_{t0}\), which cannot be “Ramp Up”:

Set probability of all Ramp Up clusters to \(0\).

Normalize the probabilities.
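The probability fix can be sketched for a single observation as follows. The function name and the example numbers are hypothetical. The mirrored case (zeroing Ramp Down clusters when \(x_{t1} > x_{t0}\)) is an assumption by symmetry, as is the fallback of leaving the row unchanged when all probability mass sits on the excluded state.

```python
def fix_ramp_probabilities(probs, cluster_states, x_t0, x_t1):
    """Zero out impossible ramp clusters for one observation and renormalize.

    probs[k] is the posterior probability of cluster k,
    cluster_states[k] is the state label assigned to cluster k.
    """
    if x_t1 < x_t0:
        forbidden = "Ramp Up"
    elif x_t1 > x_t0:
        forbidden = "Ramp Down"  # assumption: symmetric treatment
    else:
        return list(probs)
    adjusted = [0.0 if state == forbidden else p
                for p, state in zip(probs, cluster_states)]
    total = sum(adjusted)
    if total == 0.0:
        return list(probs)  # assumption: nothing left to renormalize
    return [p / total for p in adjusted]


# Falling output (50 -> 40 MW), so the Ramp Up cluster is excluded.
states = ["Stable Operation", "Ramp Up", "Ramp Down"]
fixed_probs = fix_ramp_probabilities([0.2, 0.3, 0.5], states, x_t0=50.0, x_t1=40.0)
```

Applied row by row to the full probability array, this keeps each row summing to one while removing mass from states that contradict the observed direction of change.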

Figures: LSW Pr, EMS Pr

Outlook

The approach works in general
Conceptually simple
Label assignment needs some more work
Probabilistic statements may need adjustments for Ramp-Up / Ramp-Down predictions
Some kind of validation would be desirable
Results will be used partly in another research project within the EfeMOD project